Searching for nested items within the dictionary data structure
Python Forum (https://python-forum.io), Forum: Code Review, Thread: /thread-38071.html
Searching for nested items within the dictionary data structure - rob101 - Aug-30-2022

I've been studying the dictionary data structure because I wanted to discover a way of searching for nested items, so this code is a demonstration of that goal; I understand that it's possibly over-engineered for the task that I have chosen.

The user input is not fully sanitized, but (as you will see from the code comments) I have that covered by a custom function that I've already written (for the sake of brevity, that function is not included here). Also for the sake of brevity, I've only included a few books, but you can add as many as you like for testing.

I know of no bugs, so if you find any, or if you have any general comments about my coding style, I'm open to constructive criticism.

Thank you for reading and testing; I'll reply to any comments you may have, as and when.

Enjoy and, who knows, you may even find this to be the basis of a useful app.

    #!/usr/bin/python3

    from sys import stdout

    library = {
        # list indexed as 0 for the book title and 1 for the book author
        'computer science & programming':{
            '0-13-110163-3':[
                'THE C PROGRAMMING LANGUAGE',
                'BRIAN W.KERNIGHAN & DENNIS M.RITCHIE'
                ],
            '0-85934-229-8':[
                'PROGRAMMING IN QuickBASIC',
                'N.KANTARIS'
                ],
            '0-948517-48-4':[
                'HiSoft BASIC VERSION 2: USER MANUAL',
                'DAVID NUTKINS, ALEX KIERNAN and TONY KENDLE'
                ]
            },
        'reference':{
            '0-333-34806-0':[
                'DICTIONARY OF INFORMATION TECHNOLOGY',
                'DENNIS LONGLEY and MICHAEL SHAIN'
                ]
            },
        'novels':{
            '0-681-40322-5':[
                'THE MORE THAN COMPLETE HITCHHIKER\'S GUIDE',
                'DOUGLAS ADAMS'
                ]
            }
        }

    #===========<End of dictionary>===========#

    def search(publication, term):
        result = []
        results = []
        found = 0
        maximum = 6
        title = 0
        author = 1
        categories = library.keys()
        for category in library:
            for isbn in library[category]:
                book = library.get(category).get(isbn)
                if term[:4] == 'ISBN' and term[4:] == isbn:
                    term = book[title]
                if term in book[publication]:
                    found += 1
                    result.append(book[title])
                    result.append(book[author])
                    result.append(category)
                    result.append(isbn)
                    results.append(result)
                    result = []
                if found > maximum:
                    break
            if found > maximum:
                break
        if found:
            if found > maximum:
                return maximum
            else:
                return results
        else:
            return

    #=========<End of search function>=========#

    def output(results, file=stdout):
        print("-"*50)
        if isinstance(results, list):
            for books in results:
                for book in books:
                    print(book)
                print("-"*50)
        else:
            print("Search results exceeds the maximum of {}".format(results))
            print("-"*50)

    #=========<End of output function>=========#

    find, found = None, None

    # attempt = the index reference passed to <if term in book[publication]>
    attempt = 0 # 0 = book title 1 = book author

    quit = False

    while not find and not quit:
        print('''
    Search term must be alphanumeric characters only
    and greater than three characters in length.

    For a ISBN search, enter ISBN and press return.
    ''')
        find = input("Search term: ").strip().upper()
        # to-do: check the input with the user input checker function
        if find == 'QUIT':
            quit = True
        elif find =='ISBN':
            print("ISBN search")
            isbn = input("ISBN: ").strip()
            find = find+isbn
        if len(find) > 3:
            found = search(attempt, find)
        else:
            find, found = None, None
        if find and not quit:
            while not found and attempt < 1: # change this if more fields are added to the publication
                attempt +=1
                found = search(attempt, find)
            if found:
                output(found)
                find, found = None, None
                attempt = 0
            elif not quit:
                print("Nothing found")
                find = None
                attempt = 0
    print("Search exit.")

RE: Searching for nested items within the dictionary data structure - Gribouillis - Aug-30-2022

(Aug-30-2022, 08:55 AM)rob101 Wrote: Thank you for reading and testing; I'll reply to any comments you may have, as and when.

I'll try to look into this. A few remarks while reading:
RE: Searching for nested items within the dictionary data structure - rob101 - Aug-30-2022

(Aug-30-2022, 09:12 AM)Gribouillis Wrote: I'll try to look into this. A few remarks while reading:

Thank you. I will update the code (above) in one hit, once any feedback that requires a code update seems to be in.

I have to admit (and as you've likely guessed) I'm not up to speed with your point 1 and point 2. As for point 3, yes; that's something that I should have taken care of, and it will be.

Point 1: By this, do you mean that I should have a 'driver' to simulate user input, or am I barking up the wrong tree?

Point 2: Could you (if you've time) give me a quick explainer as to why this is a good option to have and how it could be used? The function will be amended to: def output(results, file=sys.stdout), which is (as I understand it) the default.

Point 3: Done. The code here will be updated as and when.

With thanks and regards.

RE: Searching for nested items within the dictionary data structure - Gribouillis - Aug-30-2022

(Aug-30-2022, 11:12 AM)rob101 Wrote: Point 1: By this, do you mean that I should have a 'driver' to simulate user input or am I barking up the wrong tree?

In the end, it could be an option, but unit tests are made to test small «units» in a program, not the program as a whole. For example, they test a function's behavior. Here is how you could start unit testing the output() function, for example.
I inserted the following code just before the find, found = None, None in your code:

    import io
    import sys
    import unittest

    class TestOutput(unittest.TestCase):

        def test_print_error_message_if_results_is_integer(self):
            results = 25
            ofh = io.StringIO()
            output(results, file=ofh)
            s = ofh.getvalue()
            self.assertIn(f'exceeds the maximum of {results}', s)

        def test_output_contains_titles(self):
            results = [['ti0', 'au0', 'ca0', 'is0'], ['ti1', 'au1', 'ca1', 'is1']]
            ofh = io.StringIO()
            output(results, file=ofh)
            s = ofh.getvalue()
            self.assertIn('ti0', s)
            self.assertIn('ti1', s)

    if sys.argv[-1] == 'test':
        unittest.main(argv=sys.argv[:-1])
        sys.exit(0)

Now if, instead of python program.py, you call python program.py test, it will run the tests instead of an interactive session.

To make the output() function testable, I had to inject the file in its parameters, and this answers your second question: to make the output function testable, you need to be able to inject the file object. I did it in a simple way here:

    import builtins
    import functools
    import sys

    def output(results, file=sys.stdout):
        # shadow print locally so that every call below writes to the injected file
        print = functools.partial(builtins.print, file=file)
        print("-"*50)
        if type(results) is not int:
            for books in results:
                for book in books:
                    print(book)
                print("-"*50)
        else:
            print("Search results exceeds the maximum of {}".format(results))
            print("-"*50)

    #=========<End of output function>=========#
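A possible alternative, not taken from the thread: if changing the function's signature is not wanted, the standard library's contextlib.redirect_stdout can capture whatever print() writes. This is only a minimal sketch, assuming a version of output() that takes no file parameter; the output() below is a stand-in for the one in the first post, and capture_output is a hypothetical helper name.

```python
import contextlib
import io


def output(results):
    """Stand-in for the thread's output(); prints straight to stdout."""
    print("-" * 50)
    if isinstance(results, list):
        for books in results:
            for book in books:
                print(book)
            print("-" * 50)
    else:
        print("Search results exceeds the maximum of {}".format(results))
        print("-" * 50)


def capture_output(results):
    """Hypothetical helper: run output() and return what it printed."""
    buffer = io.StringIO()
    # sys.stdout is swapped for the buffer only inside this block
    with contextlib.redirect_stdout(buffer):
        output(results)
    return buffer.getvalue()


captured_list = capture_output([['ti0', 'au0', 'ca0', 'is0']])
captured_int = capture_output(25)
```

The trade-off is that the function stays tied to sys.stdout, whereas injecting the file (as above) also lets a caller write the results somewhere else in normal use.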
rob101 Wrote: Point 3: Done. The code here will be updated as and when.

You could perhaps upload the code to a site such as GitHub Gist, which allows you to push updates of the code through git, like I did for this module for example, and leave a link in this thread so we could have the latest version at any time.

RE: Searching for nested items within the dictionary data structure - rob101 - Aug-30-2022

(Aug-30-2022, 12:54 PM)Gribouillis Wrote: In the end, it could be an option, but unit tests are made to test small «units» in a program, not the program as a whole.

This is all very helpful, and I need to take a little time to get my head around these new (to me) concepts and evaluate the code that you have posted, so that I fully understand what you've done, as well as why.

(Aug-30-2022, 12:54 PM)Gribouillis Wrote: You could perhaps upload the code to a site such as github gist...

This is an option that I will look into. In the meantime, I will update the code that's in my first post: I feel that it's maybe better to do that than to have multiple versions sprinkled around this thread.

Given that I have the output() function and that it can be 'unit tested' in the way that you demonstrate, I feel it could be better to have all the print() calls moved to the output() function, right? That is to say, the ones that are concerned with the search results, such as "Nothing found".

A thought that's come to mind as I type this: once testing has been done, is it 'best practice' to remove the code that facilitates said testing, or does one leave it as is? I feel it should be removed, as it plays no part in the functionality of the app, right? It's details such as this that are of as much interest to me as writing the code.

With that last thought in mind, I will refrain from including any of the code that is purely for testing until I'm clear about what should and should not be included in the, shall we call it, release candidate.
Thank you very much for your time, as well as the information, and I look forward to your next reply, as and when you have more time to do so.

RE: Searching for nested items within the dictionary data structure - perfringo - Aug-30-2022

This is too verbose:

    categories = library.keys()
    for category in categories:
        books = library.get(category)
        for isbn in books:
            book = books.get(isbn)
            # do something with book

If you iterate over a dictionary then you iterate over its keys. So you can reduce this to:

    for category in library:
        for record in library[category]:
            # do something with library[category][record]

Which raises the question about the way the data is structured. If I get some data from upstream, my first action is to check whether I should convert it to make it simpler (and faster) to work with. In this particular case, a list of dictionaries could be one possibility: a very simple and generic filtering function could deliver all the required functionality. The current code iterates over all the data, so there should not be any performance penalty either. Another possibility is to use a dataframe and take advantage of vectorization.

RE: Searching for nested items within the dictionary data structure - rob101 - Aug-30-2022

(Aug-30-2022, 02:17 PM)perfringo Wrote: This is too verbose:...

Yes, it does. I'm not one for nested data structures, if they can be avoided, but keep in mind that this is an academic exercise for me, just because I wanted to learn how one would go about searching such a data structure, if one needed to. If I were implementing a way to store and search a book collection, I would not use this code, as there are much simpler ways in which that can be done.

I will have a look at the improvement that you've posted, for which I am grateful, as I'm sure it will be better and I will be able to apply what you've shown me.

With thanks and regards.

To add...

(Aug-30-2022, 02:17 PM)perfringo Wrote: If you iterate over a dictionary then you iterate over its keys.
So you can reduce this to:

I've run a test and, from what I can see, your improvement will work for me:

    for category in library:
        for isbn in library[category]:
            book = library.get(category).get(isbn)
            if term[:4] == 'ISBN' and term[4:] == isbn:
                # do the rest of the search from here...

which I will implement and update (unforeseen issues aside). Thank you.

RE: Searching for nested items within the dictionary data structure - Larz60+ - Aug-30-2022

For what it's worth, here's how I generically display nested dictionaries.

    def display_dict(dictname, indentwidth=0):
        indent = " " * (4 * indentwidth)
        for key, value in dictname.items():
            if isinstance(value, dict):
                print(f'\n{indent}{key}')
                # recurse one level deeper; this level's indent stays unchanged
                display_dict(value, indentwidth + 1)
            else:
                print(f'{indent}{key}: {value}')

    def testit():
        urllist = {
            "LocalGovernment": {
                "Argentina": {
                    "MisionesOpenData_AR": "http://www.datos.misiones.gov.ar/"
                },
                "Austria": {
                    "ViennaOpenData_AT": "https://www.data.gv.at/"
                },
                "UnitedStates": {
                    "alabama": {
                        "Alabaster": {
                            "Rank": 16,
                            "URL": "https://www.cityofalabaster.com/",
                            "Population": "33,373"
                        },
                        "Albertville": {
                            "Rank": 27,
                            "URL": "https://www.cityofalbertville.com/",
                            "Population": "21,620"
                        }
                    }
                }
            }
        }
        display_dict(urllist)

    if __name__ == '__main__':
        testit()
RE: Searching for nested items within the dictionary data structure - Gribouillis - Aug-30-2022

(Aug-30-2022, 08:13 PM)Larz60+ Wrote: Here's how I generically display nested dictionaries.

Or using the asciitree module from PyPI:

    import asciitree

    class OurTraversal(asciitree.traversal.Traversal):
        def get_children(self, node):
            k, v = node
            return list(v.items()) if isinstance(v, dict) else []

        def get_root(self, tree):
            return tree

        def get_text(self, node):
            k, v = node
            return k if isinstance(v, dict) else f'{k}: {v}'

    def testit():
        urllist = {
            "LocalGovernment": {
                "Argentina": {
                    "MisionesOpenData_AR": "http://www.datos.misiones.gov.ar/"
                },
                "Austria": {
                    "ViennaOpenData_AT": "https://www.data.gv.at/"
                },
                "UnitedStates": {
                    "alabama": {
                        "Alabaster": {
                            "Rank": 16,
                            "URL": "https://www.cityofalabaster.com/",
                            "Population": "33,373"
                        },
                        "Albertville": {
                            "Rank": 27,
                            "URL": "https://www.cityofalbertville.com/",
                            "Population": "21,620"
                        }
                    }
                }
            }
        }
        s = str(asciitree.LeftAligned(traverse=OurTraversal())(('', urllist)))
        print(s)

    if __name__ == '__main__':
        testit()
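The same recursive pattern used for display in the two posts above (iterate over .items(), test each value with isinstance()) can also be pointed at searching, which is the thread's original question. What follows is a minimal sketch and not code from the thread: find_in_nested is a hypothetical helper, the two-category library is a cut-down copy of the one in the first post, and only leaf values are matched (keys such as the ISBN are recorded in the path but not searched).

```python
def find_in_nested(data, term, path=()):
    """Yield (path, value) for every leaf whose text contains term.

    data may be a dict, a list, or a leaf value; path is the tuple of
    keys and list indexes that leads to the current position.
    """
    if isinstance(data, dict):
        for key, value in data.items():
            yield from find_in_nested(value, term, path + (key,))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            yield from find_in_nested(value, term, path + (index,))
    elif term in str(data):
        yield path, data


# Cut-down copy of the library from the first post (assumption: same layout)
library = {
    'reference': {
        '0-333-34806-0': [
            'DICTIONARY OF INFORMATION TECHNOLOGY',
            'DENNIS LONGLEY and MICHAEL SHAIN',
        ],
    },
    'novels': {
        '0-681-40322-5': [
            "THE MORE THAN COMPLETE HITCHHIKER'S GUIDE",
            'DOUGLAS ADAMS',
        ],
    },
}

matches = list(find_in_nested(library, 'ADAMS'))
for path, value in matches:
    print(path, value)
```

Because the function does not assume a fixed depth, it would keep working if another level (say, a shelf or a publisher) were added above the categories.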
RE: Searching for nested items within the dictionary data structure - rob101 - Aug-30-2022

(Aug-30-2022, 08:13 PM)Larz60+ Wrote: For what it's worth, Here's how I generically display nested dictionaries.

Thank you for that. This...

    for key, value in dictname.items():
        if isinstance(value, dict):

... looks very interesting. I'd not considered accessing the keys and values directly, in a for loop, with the .items() method, together with the isinstance() function. I'll certainly look at that usage, for my own understanding, leastways.
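Going back to perfringo's earlier suggestion of a list of dictionaries with a generic filtering function, here is one way it might look. This is a sketch under assumptions, not perfringo's code: flatten and filter_books are hypothetical names, the field names are made up for illustration, and matching is plain substring containment.

```python
# Cut-down copy of the nested library from the first post
library = {
    'reference': {
        '0-333-34806-0': [
            'DICTIONARY OF INFORMATION TECHNOLOGY',
            'DENNIS LONGLEY and MICHAEL SHAIN',
        ],
    },
    'novels': {
        '0-681-40322-5': [
            "THE MORE THAN COMPLETE HITCHHIKER'S GUIDE",
            'DOUGLAS ADAMS',
        ],
    },
}


def flatten(nested):
    """Turn {category: {isbn: [title, author]}} into a list of flat records."""
    return [
        {'category': category, 'isbn': isbn, 'title': title, 'author': author}
        for category, books in nested.items()
        for isbn, (title, author) in books.items()
    ]


def filter_books(books, **criteria):
    """Return the records in which every named field contains the given text."""
    return [
        book for book in books
        if all(text in book[field] for field, text in criteria.items())
    ]


books = flatten(library)
by_author = filter_books(books, author='ADAMS')
by_isbn = filter_books(books, isbn='0-333-34806-0')
```

With the data in this shape, a title, author, category or ISBN search is the same call with a different keyword, so the special 'ISBN' prefix handling in search() would no longer be needed.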