Hacking the Ehrenberg Collection

By Jens Dobberthin
21.03.2022 | 2 min read

For the current edition of the cultural hackathon Coding da Vinci, we have contributed over 3000 historical drawings of microorganisms by the famous researcher Christian Gottfried Ehrenberg. Read more about how to use and remix them in this blog post.

Daphnia pulex as drawn by Ehrenberg

To guide you through the dataset and give you a head start, we have set up three Jupyter notebooks.

Each notebook covers a different topic and teaches you how to search, retrieve, and enrich our dataset.

By going through these notebooks, you will also learn how to incorporate third-party services to gain new insights and a fuller picture of the Ehrenberg drawings.

Here is the full list of the notebooks (only available in English):

Ehrenberg drawings (part 1): Taxonomy resolver

Extract and enrich specimen data from the Ehrenberg Collection dataset.

Level: Intermediate

See it at: Datalore

Ehrenberg drawings (part 2): Geocoding

Use named entity recognition (NER) to find places in description texts and show them on a map.

Level: Intermediate

See it at: Datalore

Ehrenberg drawings (part 3): Querying Wikidata

Query the Wikidata API and use SPARQL to retrieve images related to extracted specimens.

Level: Intermediate

See it at: Datalore
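
To give a flavour of what part 3 is about, here is a minimal sketch of a SPARQL image lookup against Wikidata. It is not taken from the notebook: the function names are made up for illustration, and the approach (matching a specimen by its taxon name, property P225, and reading its image, property P18, from the public endpoint at query.wikidata.org) is an assumption about one reasonable way to do it.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Public Wikidata SPARQL endpoint; the wdt: prefix is predefined there.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_image_query(taxon_name: str, limit: int = 5) -> str:
    """Build a query for items with the given taxon name (P225) and an image (P18).

    Hypothetical helper, not part of the Ehrenberg notebooks.
    """
    # Escape backslashes and double quotes so the name is a valid string literal.
    escaped = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?image WHERE {\n"
        f'  ?item wdt:P225 "{escaped}" ;\n'
        "        wdt:P18 ?image .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def extract_image_urls(response: dict) -> List[str]:
    """Collect the image URLs from a SPARQL JSON results document."""
    bindings = response.get("results", {}).get("bindings", [])
    return [row["image"]["value"] for row in bindings if "image" in row]


def fetch_image_urls(taxon_name: str, limit: int = 5) -> List[str]:
    """Send the query to the endpoint; this step needs network access."""
    params = urllib.parse.urlencode(
        {"query": build_image_query(taxon_name, limit), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_ENDPOINT}?{params}",
        # A descriptive User-Agent is good practice for public endpoints.
        headers={"User-Agent": "ehrenberg-example/0.1 (illustrative sketch)"},
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_image_urls(json.load(reply))
```

Calling `fetch_image_urls("Daphnia pulex")` would return a list of image URLs if Wikidata holds a matching item with an image. Keeping query construction and response parsing separate from the network call makes both easy to try out offline.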

The notebooks and more information for developers are also available here.

Happy coding!