OpenRefine News: June 2015
Read below to find out what happened in the OpenRefine community in June.
Currently, most communication around OpenRefine happens through the mailing list and our Twitter account, where information is quickly buried for anyone not following the project on a day-to-day basis. These monthly summaries highlight key events and contributions in the community and will hopefully help circulate key information.
Feedback on the format and content is welcome. Ping us on Twitter @OpenRefine if we are missing information.
New Tutorials and Articles
If you are new to OpenRefine, Alvin Chang published an excellent introduction to clustering, perhaps the OpenRefine feature most appreciated by new users. For more advanced users, @UMBHLCuration published a tutorial on Normalizing Dates with OpenRefine.
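For readers curious about what clustering does under the hood, here is a minimal Python sketch of the key-collision idea behind OpenRefine's "fingerprint" method. It is an illustration only, not OpenRefine's actual code: the function names `fingerprint` and `cluster` are ours, and the normalization steps are a simplified approximation of what the tool applies.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a simplified key-collision fingerprint for one cell value."""
    # Trim surrounding whitespace and lowercase the value.
    text = value.strip().lower()
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, drop duplicate tokens, sort, and rejoin.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a fingerprint; keep groups with 2+ distinct spellings."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    names = ["Tom Cruise", "Cruise, Tom", "tom cruise ", "Brad Pitt"]
    for group in cluster(names):
        print(group)
```

Values whose fingerprints collide are proposed as one cluster, which is why spellings that differ only in case, punctuation, or word order end up together.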
At the Toronto OpenRefine UnConference, I presented on Iterative data discovery and transformation with OpenRefine, explaining why OpenRefine is an essential tool for non-technical subject matter experts working with data. To dig further into the topic, read the article Agile Data Process with OpenRefine.
If you work in a library, check out the recording of this month's NCompass Live: Metadata Manipulations: Using MarcEdit and OpenRefine. @silviaegt also published the slides of his DHBenelux talk, with links to VIAF reconciliation for OpenRefine and CartoDB.
Finally, a great use case presented by @hpiedcoq and @jvilledieu: How to visualize your Facebook network with OpenRefine, OutWit and Neo4j.
We also have new resources published in French and Italian:
- Créer un référentiel pour le #LOD avec #OpenRefine et #Gingo by @Invisu
- Tutorial per come farsi del geocoding in casa partendo da un elenco di indirizzi
Development Update
OpenRefine 2.6 RC1
Over the last three months we received contributions translating the OpenRefine interface into Spanish and French. These will complement the existing English, Italian and Chinese versions.
We are currently testing the 2.6 RC1 version as a quick checkpoint to verify all the fixes made since beta 1 and to figure out which remaining loose ends need to be cleaned up. More information is available on the developer mailing list.
Reconciliation and Matching Framework (RMF)
Matthew Blissett, with the support of the Royal Botanic Gardens in London, is releasing the Reconciliation and Matching Framework (RMF), a framework for matching string entities using customised sets of transformations and matchers, plus a tool to produce the necessary configurations and another to expose them as OpenRefine reconciliation services.
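To give a feel for the "transformations plus matchers" idea, here is a small Python sketch. It does not use RMF's code or configuration format: every name in it (`lowercase`, `strip_punctuation`, `collapse_spaces`, `exact`, `within_edit_distance`, `matches`) is hypothetical and chosen for illustration only.

```python
import re
from typing import Callable, List

# Hypothetical types: a transformer rewrites a string, a matcher compares two strings.
Transformer = Callable[[str], str]
Matcher = Callable[[str, str], bool]


def lowercase(s: str) -> str:
    return s.lower()


def strip_punctuation(s: str) -> str:
    # Keep only lowercase letters, digits and whitespace (run after lowercase).
    return re.sub(r"[^a-z0-9\s]", "", s)


def collapse_spaces(s: str) -> str:
    return re.sub(r"\s+", " ", s).strip()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def exact(a: str, b: str) -> bool:
    return a == b


def within_edit_distance(limit: int) -> Matcher:
    """Build a matcher that tolerates up to `limit` single-character edits."""
    return lambda a, b: edit_distance(a, b) <= limit


def matches(query: str, candidate: str,
            transformers: List[Transformer], matcher: Matcher) -> bool:
    """Apply the same transformations to both strings, then compare them."""
    for transform in transformers:
        query, candidate = transform(query), transform(candidate)
    return matcher(query, candidate)


if __name__ == "__main__":
    pipeline = [lowercase, strip_punctuation, collapse_spaces]
    print(matches("Quercus robur L.", "quercus  robur l", pipeline, exact))
    print(matches("Quercus robur", "Quercus rober", pipeline, within_edit_distance(1)))
```

Keeping transformers and matchers as separate, swappable pieces is what lets one configuration be strict (exact match after cleanup) and another more forgiving (small edit distance) without changing the rest of the pipeline.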
### GOKb announcement
GOKb, the Global Open Knowledgebase, is a community-managed project that aims to describe electronic journals and books, publisher packages, and the platforms that host these resources. GOKb uses OpenRefine (with a specially designed extension) as its main mechanism for getting data into GOKb, exploiting the ability to clean up the data (which tends to come from publishers and can be of variable quality) and to re-apply changes to future data from the same publisher or supplier.
GOKb opened to ‘public preview’ in January 2015, and you can sign up for an account and access the service at https://gokb.kuali.org/gokb/
Several hundred ejournal packages, and associated information about the ejournal titles, platforms and organisations have been added to the knowledgebase over the past few months. OpenRefine is used to do much of the work to get data ready for loading into GOKb.
Alongside this work of adding content, GOKb has also opened up APIs to interact with the service, which could be useful to others using OpenRefine to work with data relating to journals. In particular, the ‘Coreference service’ allows you to look up identifiers (such as ISSNs) and get back journal title information and other IDs associated with that title (as JSON or XML).
They are interested in:
- Talking to people who use OpenRefine and would like to make use of such a service
- Learning, if there is interest, what support or documentation people would like to see
- Understanding whether they can offer different or better services for OpenRefine based on the GOKb data (e.g. would other data GOKb holds be of interest? Would a reconciliation service for journal titles be useful?)
Find more details and join the discussion on the user mailing list.
## Workshops and Events
The OpenRefine Twitter feed was busy last month, with more than 12 presentations of OpenRefine given! Thank you to our evangelists who introduce OpenRefine to librarians, journalists, government and open data professionals, among other groups. Top hashtags are:
Want to connect with fellow Refiners? The following events have been announced so far. Ping us on Twitter @OpenRefine to announce your event:
- Several Cleaning & Exploring Data with OpenRefine workshops are available through July with #DH2015 in Australia.
- The Data Scientist Training for Librarians (DST4L) comes to Copenhagen with an all-day OpenRefine session on September 9.
- @UMBHLCuration will run a workshop on structured data wrangling with Python and OpenRefine for the midmichdp at Albion College on October 8th.