A compression problem
This is a problem set. Some of these are easy, others are far more difficult. The purpose of these problems sets are:
- to build your skill applying computational thinking to a problem
- to assess your knowledge and skills of different programming practices
What is this problem set trying to do[edit]
This is example problem set. In this example we are learning about lists, conditionals, and processing user input.
The Problem[edit]
I wrote a crawler that visits web pages[2], stores a few keywords in a database, and follows links to other web pages. I noticed that my crawler was wasting a lot of time visiting the same pages over and over, so I made a set, visited, where I'm storing URLs I've already visited. Now the crawler only visits a URL if it hasn't already been visited.
Thing is, the crawler is running on my old desktop computer in my parents' basement (where I totally don't live anymore), and it keeps running out of memory because visited is getting so huge.
How can I trim down the amount of space taken up by visited?
sites_ive_indexed = ('http://www.google.com','https://www.apple.com','https://mackenty.org','https://www.linode.com','https://allegro.pl')
How you will be assessed[edit]
Your solution will be graded using the following axis:
Scope
- To what extent does your code implement the features required by our specification?
- To what extent is there evidence of effort?
Correctness
- To what extent did your code meet specifications?
- To what extent did your code meet unit tests?
- To what extent is your code free of bugs?
Design
- To what extent is your code written well (i.e. clearly, efficiently, elegantly, and/or logically)?
- To what extent is your code eliminating repetition?
- To what extent is your code using functions appropriately?
Style
- To what extent is your code readable?
- To what extent is your code commented?
- To what extent are your variables well named?
- To what extent do you adhere to style guide?
References[edit]
A possible solution[edit]
Click the expand link to see one possible solution, but NOT before you have tried and failed!
not yet!