In our last post we said we were leaning towards either MariaDB or PostgreSQL. We ended up choosing PostgreSQL and have made solid progress with development since then. We have designed a first version of the database schema and now have a PostgreSQL database running on our server. So far the choice looks to have been a good one, and PostgreSQL really does integrate well with Python.
On the crawler side of things, we already have a very early working version. It searches PubMed using predefined keywords, fetches the metadata of every article the search returns, and saves the results into a test table in the database.
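As a rough illustration of that search-and-store flow (a sketch under stated assumptions, not the actual crawler code): the example below builds a PubMed search URL for one keyword through NCBI's E-utilities, flattens one article summary into a row, and writes rows to a test table. The function names, the table name test_articles, and the columns are all hypothetical. The summary keys uid, title and pubdate are assumed to follow the E-utilities JSON summary format. Python's built-in sqlite3 stands in for PostgreSQL so the sketch runs without a server.

```python
import sqlite3
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities, the public HTTP interface to PubMed.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_search_url(keyword, retmax=20):
    """Return an esearch URL that asks PubMed for article IDs matching keyword."""
    params = {"db": "pubmed", "term": keyword, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def to_row(keyword, summary):
    """Flatten one article summary (a dict) into a tuple for the test table.

    The keys used here are an assumption about the summary format.
    """
    return (
        summary["uid"],
        summary.get("title", ""),
        summary.get("pubdate", ""),
        keyword,
    )


def save_rows(conn, rows):
    """Create the hypothetical test table if needed and insert the rows.

    Rows whose pmid is already stored are skipped. With PostgreSQL and
    psycopg2 the placeholders would be %s and the duplicate handling would
    be written as ON CONFLICT DO NOTHING.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_articles ("
        "pmid TEXT PRIMARY KEY, title TEXT, pubdate TEXT, keyword TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO test_articles VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
```

A crawler built along these lines would request build_search_url(keyword) for each predefined keyword, fetch a summary for every returned ID, and pass the resulting rows to save_rows. Keeping the PubMed-specific parts in build_search_url and to_row leaves the storage step independent of the source.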
We had a meeting with Mika last week, where we discussed our schema and the code. This week we are going to make the code slightly less PubMed-specific and improve the database schema.
All the best,
The ATRO Development Team