Creating a Personal Archive of a 12-week boot camp in only two Blocks of code?

“I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it.” Not Bill Gates?

Become a Data Scientist in 12 weeks, with no prior coding experience required, open to anyone willing to dedicate themselves to a rigorous, challenging experience? Sign me up.

Photo by Mantas Hesthaven on Unsplash

To think of all the knowledge we have gained over the past 12 weeks, I know my poor mind will struggle to retain each detail of how to hyper tune each model or very in-depth understanding of A/B testing. I did have a small running list of key facts to keep track of after this program:

Short 6-page document of things I hope to remember, this lasted about a week of detailed note-taking.

However, I knew this was not going to be able to reflect all of the information we had access to including Appendix information I didn’t personally touch until 7 weeks into the program.

Same, this was just too much.

Therefore I looked for a way to preserve all of this information as a personal archive to ensure rather than fighting to understand a Stack-OverFlow post to see if it is what I need I can reference this material. A few problems arose when I started on this project:

  1. First, this material is owned in full by the Company that provided it to me as a student, through a Learning Management System (Canvas) and there is no clear way to capture each lesson in a reviewable way.
  2. Second, some of the lessons and almost all Labs are housed through a secondary platform call IllumiDesk, which provides a SaaS Jupyter notebook through a browser. The root of each lesson can be found on a provided GitHub page which was shown to be part of the solution.
  3. Third, if I could somehow come up with a process that did work for archiving this material in a reviewable way how much personal time would that require, and how annoying would it be? Well for anyone that has a repetitive task 100 times in a row, very annoying is an understatement.
Feels

So as all greats do I tackled each problem piece by piece and found some way to make it lazy, hints the title quote. I found that I could preserve each lesson as a pdf using Google Chrome's print page feature (“Ctrl” + “P” when using Chrome on Windows) which then could be saved locally to create a personal archive of the material.
Success!
This also happened to be the solution to the Lab issues described during problem two, since I could locate each lesson as a ReadME file on Github and some labs even with Solutions.
Double Success!!!
Here comes the problem, who is going to go through each phase and capture each lesson as a pdf and store them on a local computer.

Not me…

This is where PyAutoGUI comes into play. This library allows the coder to mimic mouse movements to complete tasks, their document page includes a youtube video of code playing Sushi Go Round. It is worth a watch if you are really interested in what this library can do in the right hands. In my case I just wanted a code to complete this printing task a set number of times. Below I will show the code which worked on my laptop to complete this task and some of the assumptions needed to mimic the code:

Running code that allows the computer to go through a number of Canvas Pages, 2 in this example.

Assumptions:

  • Some prior code needs to function before this will work:
Location of the Canvas tab, and a check that the Github Page is needed (Check must return a box)

This allows the checks within the code to decide which page it is looking at and how to complete the task at hand.

  • The user needs to be using Google Chrome with a pinned task (Slack in my case) and the Canvas website as the first full tab. This allows the mouse to find the correct tag.
  • You have two png files located in the same folder as the Jupyter Notebook, this is an image of the Next button
Such as this

at the bottom of the Canvas page and a snip of something that is unique on the Lab assignment page. I selected

Example image

as my snip of choice, since it was always found on these pages but not on any other assignment pages.

  • Lastly, Google Chrome’s Print page needs to be set to “Save to PDF” and the destination folder needs to have the save buttons overlapping so the clicker will complete two clicks to save the PDF in the correct place.

This can be set to run through all the lessons and labs through the Appendix so you too can have a copy of this knowledge without having to Google every “simple” question. This did not take into account any quizzes from the first 3 weeks of this course so the code will just take a PDF of the page, this would have to either be skipped in an else-if statement or deleted from the archive after the collection is complete.

Please note that you may have to adjust some of the code to match your computer’s monitor specifications and processing speed but I will leave that debugging for you to play with.

Photo by Illumination Marketing on Unsplash

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Joshua Hill

Joshua Hill

Aspiring Data Scientist looking to refine my craft through thoughtful Blog posts. See personal Projects on Github.com-Joshua-Hill-Science.