Hacking the Ehrenberg Collection

Jens Dobberthin

21.03.2022 | 2 minutes reading time

For the current edition of the cultural hackathon Coding da Vinci, we have contributed over 3000 historical drawings of microorganisms by the famous researcher Christian Gottfried Ehrenberg. Read more about how to use and remix them in this blog post.

Daphnia pulex as drawn by Ehrenberg

To guide you through the dataset and give you a head start, we have set up three Jupyter notebooks.

Each notebook covers a different topic and shows you how to search, retrieve, and enrich our dataset.

By working through these notebooks, you will also learn how to incorporate third-party services to gain new insights and a fuller picture of the Ehrenberg drawings.

Here is the full list of the notebooks (only available in English):

Ehrenberg drawings (part 1): Taxonomy resolver

Extract and enrich specimen data from the Ehrenberg Collection dataset.

Level: Intermediate

See it at: Datalore

Ehrenberg drawings (part 2): Geocoding

Use named entity recognition (NER) to find places in description texts and show them on a map.

Level: Intermediate

See it at: Datalore

Ehrenberg drawings (part 3): Querying Wikidata

Query the Wikidata API and use SPARQL to retrieve images related to extracted specimens.

Level: Intermediate

See it at: Datalore
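
To give a flavour of what part 3 is about, here is a minimal sketch of an image lookup against the public Wikidata SPARQL endpoint. It is not taken from the notebook itself: the function names `build_image_query` and `fetch_images` and the User-Agent string are made up for illustration, and the query simply assumes that a specimen can be matched through its taxon name (Wikidata property P225) and that the matching item has an image (P18).

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_image_query(taxon_name: str, limit: int = 5) -> str:
    """Build a SPARQL query for items with this taxon name that have an image.

    Hypothetical helper for illustration, not part of the notebooks.
    """
    # Escape backslashes and double quotes so the name stays a valid string literal.
    escaped = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?image WHERE {\n"
        f'  ?item wdt:P225 "{escaped}" ;\n'  # P225 = taxon name
        "        wdt:P18 ?image .\n"         # P18 = image
        "}\n"
        f"LIMIT {int(limit)}"
    )


def fetch_images(taxon_name: str, limit: int = 5) -> list[str]:
    """Run the query over HTTP and return the image URLs (needs network access)."""
    params = urllib.parse.urlencode(
        {"query": build_image_query(taxon_name, limit), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_ENDPOINT}?{params}",
        headers={
            # Placeholder User-Agent; replace it with one that identifies your project.
            "User-Agent": "ehrenberg-example/0.1 (illustrative sketch)",
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    # Each binding row maps a variable name to a dict holding its value.
    return [row["image"]["value"] for row in data["results"]["bindings"]]
```

Calling `fetch_images("Daphnia pulex")` would then return a list of image URLs for the species shown above, provided Wikidata has an item with that exact taxon name and an image attached.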

The notebooks and more information for developers are also available here.

Happy coding!