In the latest of a long line of BBC projects, the public broadcaster, which has the history, funds, and freedom to do a little more than its commercial competitors (rightly or wrongly), has launched the ‘Genome Project’, which aims to archive every single TV and radio broadcast ever made in the UK under the BBC banner.
That is a huge ask considering the BBC’s legacy: even some of its best-known shows were sacrificed or went missing before archiving improved a few decades ago. Even so, the venture aims to cover everything from 1923 onwards, with the support of the online community in the ‘curation and chronicling’ of every BBC broadcast… ever.
The official website does not yet offer much in the way of content, but the BBC have put up ‘digital editions’ of each issue of the Radio Times magazine published between 1923 and 2009, meaning that, aside from late schedule changes, users can see what was on and when at various points in modern history.
Aiming for a Wikipedia-style operation in which readers can make suggestions and changes, the BBC have opened the project up to users, with ‘Genome’ moderating their contributions. Official audio and video clips are to be added in relevant places over time, both by the broadcaster and by members of the public, as the BBC try to recover what was previously lost where possible.
An official blog post on the matter stated:
“Genome – the BBC project to digitise the Radio Times magazines between 1923 and 2009 is now live. On the site you can find BBC broadcast information – ‘listings’ – extracted from those editions. You can also search individual programme titles, contributors and synopsis information.
Our aim on this project is to curate a comprehensive history of every radio and TV programme ever broadcast by the corporation, and make that available to the public. Our first step has been this digitisation of the BBC radio and TV programme schedules from the Radio Times magazine; the next phase of the project is to incorporate what was actually broadcast, as well as the regional and national variations. It’s one of the most important steps we’re taking to begin unlocking the BBC’s archive, as Genome is the closest we currently have to a comprehensive broadcast history of the BBC.
We’re really pleased to get the site live, not least because so many of you have been asking “when”, “how soon” and telling us “how useful it would be”. The challenges in making available the 4.42 million programme records so far have been significant – you can read about some of the recent ones on the Internet blog.
We need your help too though. We’re looking to you to help us to clean up the data. The scanning process – known as ‘Optical Character Recognition’ – has produced plenty of errors: punctuation in the wrong places, spaces where there shouldn’t be any or no spaces where there should, as well as fundamental misunderstandings about who did what.
We’ve made it possible for you to submit an edit to us, as you use the site. We’ll validate your suggested changes and publish the ones which are approved.
We’ve also included a ‘Tell Us More’ form, at the bottom of each programme listing, so we can tap into the collective memory, insight and knowledge of our users, making use of the wealth of experience out there about our programmes, something we’d like to capture.
We also know that the schedule changed considerably on occasion, because of events in the real world and we need that information too.
Additionally, during the process of building Genome, we’ve identified a few ‘chunks’ of data that are missing from the database, but due to the way in which OCR works, didn’t get picked up in the original scans. So, we will be adding this in.
The Radio Times has been published with regional variations since 1926. The magazines we scanned and the data sets which have been included in Genome are not exhaustive, rather they represent the ones which we could access and which covered the greatest areas and variations. In the future, we will look into the implications of attempting to provide a more complete set of regional data.
We won’t be able to reflect what you send us straight away, but as we build on BBC’s Genome, it will come in to its own.
Now that we have published the planned broadcast schedule, our next step is to match the records in our archive catalogue (the programmes that we have a copy of in our physical archives) with the Genome programme listings. This helps us identify what proportion of the broadcasts exist in a potentially ‘playable’ form, and highlights the gaps in our archive.
It is highly likely that somewhere out there, in lofts, sheds and basements across the world, many of these ‘missing’ programmes will have been recorded and kept by generations of TV and radio fans. So we’re hoping to use Genome as a way of bringing copies of those lost programmes back in to the BBC archives too.
But, even if we don’t have an actual copy of the programme, we’ll also look to publish related items in our archives, such as scripts, photographs and associated paper-work. We’re looking in to the logistics of making some of these items available via Genome. Clearly, this will in some cases be a long and painstaking task. The BBC’s various archives contain millions of items spread over 23 archive centres across the UK, most of them in analogue form. It’s a big job, one we’re looking forward to reporting back on in the future.
What happens after 2009 when the Genome data “stops”? Well the information held at www.bbc.co.uk/programmes starts in 2007 (the birth of the iPlayer) and as the Genome data is improved and corrected (by you!), we expect to start ‘backfilling’ the bbc.co.uk/programme pages with the Genome data.”
The venture looks set to be unmatched around the world, at least in sheer scale. Will the BBC’s comprehensive content wiki be seen as another passing fancy by its critics, or will the Genome Project be able to put every piece of the broadcaster’s history (presumably with a five-year buffer for new items) together in an intriguing and well-presented manner?