Social Network Analysis and Visualisation for #RDAPlenary 3

Twitter is a great source of information as well as a fantastic communication tool at specific events. In our (academic) world, conferences are among the most common events, and a significant share of the communication between participants now happens on Twitter. Last week Dublin hosted the Research Data Alliance Plenary 3, with over 400 participants. It was organised by the Insight Centre and the DRI, as well as other partners from Australia, Europe and the US. We assisted the organising committee with the event’s social media, and as part of this we presented the top tweeters for each day, as well as the top tweeters and the most retweeted users for the whole conference. In addition, I presented an analysis and visualisation of the conversation between #RDAPlenary tweeters. Here I am sharing how these were done, in case you find them interesting or want to do the same for your own events.

In this blog post I take you step by step through what I did for (1) computing top tweeters and retweeters, and (2) analysing and visualising the #RDAPlenary social network.

Tools used:

  • Twitter: You need a Twitter account
  • ScraperWiki: You need a ScraperWiki account
  • MS Excel – used only for top tweeter identification and not necessary for social network analysis
  • OpenRefine
  • Gephi

Getting #RDAPlenary tweets

The first step is to download tweets from Twitter. Twitter doesn’t provide free access to all the tweets posted on the platform. It does, however, give us access to a selection of tweets via the Twitter APIs, by searching for specific hashtags, keywords or users. Access to all Twitter data, the Twitter firehose, is heavily restricted and also expensive.

There are various ways to collect tweets. Programmers usually write their own scripts for this purpose, but here we use a very easy tool that doesn’t require any programming skills: the fabulous ScraperWiki.

1. Sign in to ScraperWiki.
2. Select ‘create a new dataset’ on the page you are directed to, called ‘data hub’.

Create a new dataset in ScraperWiki

3. Select ‘Search for Tweets’ to search for tweets you are interested in.

Search for tweets in ScraperWiki

4. The next step is to tell ScraperWiki what you are interested in. Here you can use the various advanced search operators supported by Twitter. The easiest, however, is to search for a hashtag or a keyword. For our purpose we searched for #RDAPlenary to collect the conference tweets.

Search for #RDAPlenary

You may be asked to authorise ScraperWiki to access your Twitter account, and then ScraperWiki will start downloading all the tweets it can find and access. You can also tick a box for collecting future tweets matching the hashtag.

5. After a short while ScraperWiki will show you that the tweets are collected and ready for you. You can now see collected tweets on the ‘View in a Table’ tab.

View collected tweets in a table

6. You will now have the option to download the tweet dataset as a spreadsheet or a CSV file. Download it as a spreadsheet. You may also use the #RDAPlenary tweet dataset that I collected and used for RDA (#RDAPlenary tweets from 25 March 2014, 6.00 a.m. until 28 March 2014, 3.00 p.m.).

Open the spreadsheet you just saved and have a look at the data. You will soon find yourself familiar with various columns in the spreadsheet, such as ‘created_at’, ‘text’, ‘screen_name’ and ‘hashtags’. Feels good?!

Top Tweeters of #RDAPlenary

During the conference we presented the top tweeters and the most retweeted users. For this I used MS Excel.

At the conference we were mainly focused on the original tweets, and thus I deleted the retweets (those starting with RT) from the dataset.

1. Select the ‘text’ column and select the sort option as shown in the picture. This will sort the tweet texts alphabetically and thus put all the tweets starting with RT together.

Sorting tweets to identify retweets

2. Select the ones starting with RT and delete them all.

Delete RTs

Now we have a dataset of original tweets. Save it somewhere safe for the social network analysis later on, and use a copy of this clean dataset of original tweets for computing top tweeters in Excel.
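
If you would rather script this step than sort and delete in Excel, the same filtering can be sketched in a few lines of Python. This is only a minimal sketch and not what I used at the conference: the sample rows are made up, and the column names ‘screen_name’, ‘text’ and ‘retweet_count’ are assumed to match the columns in your download.

```python
import csv
import io


def drop_retweets(rows):
    """Keep only the rows whose text does not start with the retweet marker 'RT '."""
    return [row for row in rows if not row["text"].lstrip().startswith("RT ")]


# Made-up sample standing in for the downloaded tweet dataset.
sample_csv = """screen_name,text,retweet_count
alice,Great keynote at #RDAPlenary,3
bob,RT @alice: Great keynote at #RDAPlenary,0
carol,@alice agreed! #RDAPlenary,1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
original_tweets = drop_retweets(rows)
print([row["screen_name"] for row in original_tweets])  # ['alice', 'carol']
```

To run it on your own file, replace the `io.StringIO(sample_csv)` part with an opened CSV file.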

We are now ready to calculate top tweeters.

3. Select all the data in the spreadsheet (Ctrl+A) and then select Pivot Table from the Data tab. Pivot tables are a great way of summarising data in Excel.

Create Pivot Table

This will open up a new worksheet with a guessed pivot table. You don’t want that table, so you need to make changes to it and turn it into what you need.

4. What you want is to have the user handles in your first column and the count of each user’s tweets next to their Twitter handle. You should have a ‘Pivot Table Builder’ window automatically opened, or you should be able to access it under the Pivot Table>Data section. This builder helps you to choose the row and column labels and the values for your pivot table.

5. Scroll down the Field name list and un-tick all the boxes, so that we can start fresh. You can also drag the fields out of the Row Labels, Column Labels and Values boxes and drop them in the big Field name box.

Now we have an empty pivot table.

6. Drag ‘screen_name’ from the Field name box to the Row Labels box.

7. Drag ‘text’ from Field name box to Values box.

Pivot Table Builder

8. Click on the i on the right side of the dropped text under Value and make sure ‘Count’ is selected. This counts the tweets for each row, i.e. each Twitter handle.

Now you just need to sort your pivot table to see the top tweeter.

9. Select the values under the Total column down to the Grand total, leaving the Grand total value out. Then choose the sort option.

Sort Pivot Table for top Tweeter

Yay! You now have the top tweeters computed for you.

10. You can find the most retweeted users in the same manner. For this, choose ‘retweet_count’ in your pivot table, select the ‘Sum’ function for it, and then sort again. This will give you the total number of retweets that each user’s tweets received.
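
The two pivot tables above boil down to a count and a sum per Twitter handle. As a minimal sketch, assuming each tweet is a record with ‘screen_name’ and ‘retweet_count’ fields (the sample tweets and the function names are made up for illustration), the same numbers can be computed like this:

```python
from collections import Counter

# Made-up original tweets (retweets already removed).
tweets = [
    {"screen_name": "alice", "retweet_count": 3},
    {"screen_name": "alice", "retweet_count": 1},
    {"screen_name": "bob", "retweet_count": 5},
    {"screen_name": "carol", "retweet_count": 0},
    {"screen_name": "carol", "retweet_count": 0},
    {"screen_name": "carol", "retweet_count": 2},
]


def top_tweeters(tweets, n=3):
    """Count tweets per handle, like a pivot table with 'Count' of text."""
    return Counter(t["screen_name"] for t in tweets).most_common(n)


def most_retweeted(tweets, n=3):
    """Sum retweet_count per handle, like a pivot table with 'Sum' of retweet_count."""
    totals = Counter()
    for t in tweets:
        totals[t["screen_name"]] += int(t["retweet_count"])
    return totals.most_common(n)


print(top_tweeters(tweets))    # [('carol', 3), ('alice', 2), ('bob', 1)]
print(most_retweeted(tweets))  # [('bob', 5), ('alice', 4), ('carol', 2)]
```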

Data manipulation using OpenRefine

Now we want to get ready for some exciting social network analysis and find out how much conversation has been going on during #RDAPlenary. I used Gephi for this, but before being able to send the data to Gephi for network analysis and visualisation, we need to transform data into a Gephi friendly shape. For this I used OpenRefine, which is a great tool for data cleansing and manipulation.

Extracting mentions using OpenRefine

What we need to do is to extract mentions from each tweet to be able to draw the conversation between people identified by their twitter handle (@mentions).

1. Open OpenRefine with the default options.

2. Load your original cleaned dataset (the one you removed RTs from and saved to put aside) into OpenRefine using “create project”.

What we want now is to transform the data into a format that Gephi likes, and that is a format that has two columns: (1) user handle (screen name) and (2) the handle of the second person mentioned in the user’s tweet.

Ok, the first thing we need to do is to extract the mentions, which start with @, from the tweets.

3. Before anything let’s unify mentions by lowercasing all the tweets’ texts. Select ‘edit cells>common transforms>to lowercase’ from the column options menu of the text column (the arrow next to the column name).

Transform Tweet texts to all lowercase

Great! Now we need to extract all the mentions from the tweet texts. For this we have to take several steps at once.

First we need to split all the words. Twitter separates words by anything that is not a letter, number, dash (-), or underscore (_).

4. Select “edit columns – add column based on this column” from the text column’s menu and name the new column ‘mentions’. The expression that will split the column’s text into words is:

 value.split(/[^a-z0-9-_@#]/)
Add a column based on text column

Next we need to filter the list to pull out the mentions, i.e. everything that starts with an @.

To filter the list, we use the function filter(). The filter() function wants your list first, then the name of a variable to assign to each element of the list, and then an expression that checks whether or not that element should be included.

5. To filter for each person mention, we can amend the above expression as follows:

filter(value.split(/[^a-z0-9-_@#]/),i,i.startsWith("@"))
Filter expression for filtering user mentions

6. We now need to join the list by appending .join(",") to the expression. This joins the list into a single string of text, inserting commas between the mentions in case there is more than one mention in a tweet.

.join() expression

7. Now add this column to the dataset.
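
For readers who think more easily in code, here is a minimal Python sketch of what the OpenRefine expression does: lowercase, split, filter, join. It is not the GREL expression itself. The function name `extract_mentions` and the sample tweet are made up, and the regular expression is copied from the expression above.

```python
import re

# Same character class as in the OpenRefine expression: split on anything
# that is not a lowercase letter, digit, dash, underscore, @ or #.
SPLIT_PATTERN = re.compile(r"[^a-z0-9\-_@#]")


def extract_mentions(text):
    """Return the @mentions in a tweet as one comma-separated string."""
    words = SPLIT_PATTERN.split(text.lower())              # lowercase, then split
    mentions = [w for w in words if w.startswith("@")]    # keep only the mentions
    return ",".join(mentions)                             # join with commas


print(extract_mentions("Thanks @Alice and @bob_smith for the #RDAPlenary talk!"))
# @alice,@bob_smith
```

A tweet without any mention gives an empty string, which corresponds to a blank cell in the ‘mentions’ column.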

We now want to remove the columns that we don’t need, and bring the two columns ‘screen_name’ and ‘mentions’ into position.

Re-order columns

8. Delete the columns that we don’t need and reposition the ‘screen_name’ and ‘mentions’ columns if required.

Re-order screen_name and mentions

Great, now we have two columns left, just as we were wishing for: the user handle and a comma-separated list of mentions in a specific tweet from that handle.

Repositioned columns

9. For Gephi, we’ll need to have each user-mention pair in one row. Thus we need to split the multi-valued column into several rows. Do so with ‘edit cells>split multi-valued cells’, and split by comma (,).

Split by comma

Boom, done! Oh, wait, but we now have a number of rows with blank screen_name!

Blank cells

10. Do not worry! Use ‘edit cells – fill down’ on the screen_name column to copy each screen_name into the empty rows below it.

Fill down

11. You now notice that there is a difference between the screen_names and the mentions: mentions start with @ (for users) and are in lowercase.

Let’s first lowercase the screen_name column using “edit cells – common transforms – to lowercase”, as we did before.

12. Then use “edit cells – transform” on the screen_name column to add the @ before each screen_name, using the expression "@" + value.

@screen_name

13. Voilà! You have now formatted the file for Gephi. Download it as CSV (comma-separated values) using the ‘Export’ button on the top right of your OpenRefine window.
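
Steps 9 to 13 turn each tweet into one row per user-mention pair. Below is a minimal Python sketch of the same reshaping, assuming rows with a ‘screen_name’ and a comma-separated ‘mentions’ value as produced above. The function name `build_edges` and the sample rows are made up. Unlike the OpenRefine route, this sketch skips tweets without mentions and writes no header row, which are choices of the sketch rather than part of the workflow described here.

```python
import csv
import io


def build_edges(rows):
    """Turn (screen_name, 'a,b,c') rows into one (source, target) pair per mention."""
    edges = []
    for row in rows:
        source = "@" + row["screen_name"].lower()  # match the lowercased @mentions
        for mention in row["mentions"].split(","):
            if mention:  # skip tweets that mention nobody
                edges.append((source, mention))
    return edges


# Made-up rows standing in for the two columns kept in OpenRefine.
rows = [
    {"screen_name": "Alice", "mentions": "@bob,@carol"},
    {"screen_name": "bob", "mentions": ""},
    {"screen_name": "Carol", "mentions": "@alice"},
]
edges = build_edges(rows)

# Write the pairs as CSV text, one pair per line.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(edges)
print(buffer.getvalue())
```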

Social Network Analysis using Gephi

We are now ready to start our social network analysis in Gephi.

1. Start Gephi and choose ‘new project’.

2. Open the CSV with ‘file>open’.

3. Select “directed” and leave the defaults.

Load #RDAPlenary dataset into Gephi

Since Gephi takes all the rows and columns into account, we need to remove the column headers.

4. Change the view to the ‘data laboratory’ view.

5. Remove the first two nodes (screen_name and mentions): mark them, then right click and select ‘delete all’.

Delete column headers

6. Change back to the ‘Overview’.

This is a pretty ugly and unreadable graph! That is because we haven’t applied any layout options yet.

Initial #RDAPlenary graph in Gephi

7. It is easy: In the layout window on the left, select ‘ForceAtlas’. ForceAtlas is a simple algorithm, which groups connected nodes closer together.

ForceAtlas

Let ForceAtlas run for a while and stop it when you feel better about the graph.

8. But how do we know which node is who? Let’s enable the labels.

Show labels

Ok, we now have the labels, i.e. user handles, but they are not clear to read, and we don’t know which labels are more important than others.

#RDAPlenary graph with labels

9. To scale the labels by how many connections a node has, select label size on the top right of the Ranking window, then choose ‘degree’ as the parameter.

Choose a rank parameter

10. Click on ‘apply’ and play with the parameters (minimum and maximum size) as you see fit (make sure you apply every time when you make changes).
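
To make the idea of ranking by degree concrete, here is a minimal Python sketch that counts connections per handle and maps them linearly onto a label size range. It is an illustration under my own assumptions, not a description of Gephi’s internals: degree is taken as incoming plus outgoing distinct edges, and the function names, size range and sample edges are made up.

```python
from collections import Counter

# Made-up user-mention pairs; the repeated pair is counted only once.
edges = [
    ("@alice", "@bob"),
    ("@alice", "@carol"),
    ("@carol", "@alice"),
    ("@alice", "@bob"),
]


def degree_counts(edges):
    """Count incoming plus outgoing distinct edges for every handle."""
    degree = Counter()
    for source, target in set(edges):
        degree[source] += 1
        degree[target] += 1
    return degree


def scale_sizes(degree, min_size=1.0, max_size=4.0):
    """Map degrees linearly onto the range [min_size, max_size]."""
    low, high = min(degree.values()), max(degree.values())
    span = (high - low) or 1  # avoid dividing by zero when all degrees are equal
    return {
        node: min_size + (d - low) / span * (max_size - min_size)
        for node, d in degree.items()
    }


degree = degree_counts(edges)
sizes = scale_sizes(degree)
print(dict(degree))
print(sizes)
```

Changing `min_size` and `max_size` here plays the same role as adjusting the minimum and maximum size in the Ranking window.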

11. Cool, but we still can’t read a thing! Luckily there is a layout called ‘label adjust’. This layout will move nodes so the labels don’t overlap. Let it run for a while.

#RDAPlenary graph after applying ‘Label Adjust’

12. For a final view, switch to the Preview tab and select refresh (make sure ‘show labels’ is selected). You can change the presets and node values as you wish and refresh to see the result.

Previewing and exporting the graph

13. When you are happy with the result, export your graph to your preferred format. Pretty, isn’t it?!

#RDAPlenary Social Graph

Here you can find the exact network visualisation I presented at the RDA Plenary 3 closing session and the complete #RDAPlenary tweet dataset that I used for it. More on RDA Plenary social media can also be found here.

 

This work was inspired by this School of Data work.

Social savvy journalists gather in NYC to discuss the future of UGC in News

ONE HUNDRED hours of video is now uploaded every minute. Soon there will be more user generated content (UGC) than professionally produced media. But what does this mean for the media industry? Storyful’s Mark Little joined a panel of expert social media journalists at an event organised by the ONA at the AP offices in New York on Tuesday evening to discuss how UGC will affect the media industry in 2014.

Mark Little is the CEO and founder of Storyful, a social media agency that specialises in gathering, verifying and disseminating UGC. Storyful was recently acquired by News Corp for a reported $25 million. The other experts on the panel were Eric Carvin, social media editor for the AP; Erik Martin, General Manager at reddit; Katie Rogers, Social News Editor at Guardian US; and Julie Whitaker, Social Media Editor for WNYC radio.

Little predicted the death of the proprietary scoop and pointed out that reporters need to forget about ‘me first’ journalism and use their skills to turn content into stories. He said that UGC is no longer a shiny new thing but is now a vital part of the news gathering process, quipping that “TV anchors sitting in front of walls of tweets is so 2013”.

The panelists all agreed that we are in a golden era of journalism but discussed some issues that needed to be addressed including concerns over the safety of UGC creators. Carvin said he’s afraid the day will soon come where someone will get arrested, hurt or even killed because journalists asked them to go and gather content. “We have to be willing to recognise tweeting at someone could put them in more danger sometimes and not do it”, he added. Little noted that a whole new legal, ethical, financial framework is needed for user-generated content.

The importance of UGC was underlined during the event, when panelists were distracted by news of more violence on the streets of Kiev breaking on social media and the audience of news addicts scrambled to follow #Kyiv on Twitter. Julie Whitaker was keen, however, to point out that UGC is not used for breaking news alone. She said that some of her radio channel’s best material came from UGC, such as people remixing audio of NYC mayors and listeners creating a map of the best sledding spots in the city. All the panelists had entertaining examples of non-breaking news stories, from Storyful’s ‘Emotional baby’ video (YouTube it) to subreddits that rate people’s levels of marijuana inebriation.

The overwhelming message from the panel was that journalists still need the same skills as they always did and that UGC is just another source that should be treated no differently to any other tip. It needs to be verified independently. Carvin said the first thing he advises his journalists to do is to get the source out of ‘social’, by making a phone call or meeting in person. Erik Martin mentioned that this change is nothing new to journalism as he quipped that 50 years ago people were probably worried about ‘town hall generated content’ or ‘late night bar generated content’. Little added that social media just reflects what humans have been doing for centuries – talking about events, making jokes and interacting.

You can find some notes from the event here and the event curated Storify here.

Hacks/Hackers Workshop: How to ask questions & build great designs

As 2013 drew to a close, Hacks/Hackers Dublin held a pre-holiday meetup at the Irish Times building. This was the second meetup for the group, which was established in the summer of 2013, and it drew over 40 members to a team-building workshop facilitated by Justin Ferrell, journalist and Fellowships Director of Stanford University’s innovative d.school.

Dublin’s new Hacks/Hackers at the beginning of Justin Ferrell’s workshop

Ferrell is a charming and enthusiastic career innovator who has worked as a journalist for the Washington Post, and has designed several award-winning projects, including the investigative series Angler: The Cheney Vice Presidency, which won the 2008 Pulitzer Prize for National Reporting. His approach to design is ‘human-centred’, and the process he led us through explored what this means, in practice, for design.

As the hacks and hackers arrived in the room, we didn’t quite know what to expect – we just knew that Ferrell would be leading a workshop. Would we hear about his adventures in journalism? His epiphanies at Tiananmen Square? Would we have to stand up and actually say something as part of this workshop? The interest in Ferrell’s work, and in the newly forming chapter of Hacks/Hackers Dublin, was clearly evidenced by the fact that so many people showed up without knowing exactly what to expect.

Our design materials: back to primary school arts and crafts!

In the end, Ferrell took us through a deceptively simple exercise: pair up with someone at your table, interview them about what they would like in a perfect wallet, and then design that wallet for them. We were given a pen and paper to jot down notes (like a good reporter), a selection of coloured cardstock, scissors, tape and some rubber bands; not exactly the tools of cutting-edge design expected by a generation raised on iThis and iThat.

However, the material aspect of the design quickly became a secondary concern, as Ferrell took us through the various stages of the exercise, which were mostly based on asking better questions. In the end, the exercise turned out to be a fun lesson in the art of listening, and learning how to ask questions that don’t initially present themselves as obvious.

Inspired Design.

In other words, Ferrell was teaching us how to be better communicators, and therefore better requirements analysts – skills useful for hacks and hackers alike. Instead of focusing on the nuts and bolts of design, we were encouraged to let our analysis and questions follow our partner’s stated needs, but also their more implicit desires. If a question about wallet function prompted a discussion about the family photographs contained in the wallet, we were asked to let our next questions follow that topic, instead of forcing an immediate return to the pragmatic applications. In this way, the session used the principles of free and open brainstorming to ultimately encourage more innovative outcomes (many pairs abandoned a traditional wallet altogether). It was also about allowing intuition to enter into the process of communication and collaboration, and allowing a conversation to go in directions not originally intended. By focusing on process over product, the argument seemed to go, one eventually ends up with a better product.

He made us stand up, after all.

Overall, the workshop was an excellent team building exercise for our new Hacks/Hackers chapter – it served as an ice-breaker, as well as a helpful introduction to methods we could use when hacking new pathways into journalism.

If you are interested to know more about Justin’s Hacks/Hackers Dublin design workshop, we have a treat for you! You can now watch the workshop’s video on Vimeo, thanks to the Irish Times for recording it.

So… what’s next? Hacks/Hackers Dublin now has 130 members, and plans are currently in the works to hold a one- or two-day hackathon. If you’re interested in joining the chapter, or have ideas about projects to pursue, visit the group page at http://www.meetup.com/hacks-hackers-dublin/ and follow @HacksHackersDUB on Twitter.

Hacks/Hackers Dublin is a chapter of the international Hacks/Hackers organisation. The chapter founder is Bahareh R. Heravi (Digital Humanities and Journalism group, Insight Centre at NUI Galway) and the co-organisers of the chapter are Natalie Harrower (Digital Repository of Ireland, Royal Irish Academy), Gavin Sheridan (Storyful), Paul Michael Watson (Storyful) and Jarred McGinnis (Semantic News Technologist and the former Head of Research, Semantic Technologies at the Press Association). We would also like to thank Johnny Ryan (The Irish Times) for co-organising and hosting this event.

Research Assistants vacancies for Social Semantic Archive Project

The National University of Ireland, Galway, is seeking two outstanding candidates for Research Assistant positions in the Social Web and Semantic Web / Linked Data domain. The successful candidates will work within the Digital Humanities and Journalism group at the Insight Centre for Data Analytics at NUI Galway (formerly known as DERI), and will also closely collaborate with the Digital Repository of Ireland at the Royal Irish Academy.

The successful candidates will conduct research on the application of Linked Data technologies for creating a national social media repository, and will focus on topics such as social media event detection, social network analysis, Semantic Web and Linked Data, and the archival and contextualisation of social media content.

The Insight Centre for Data Analytics at NUI Galway hosts one of the most internationally-recognised Linked Data research groups in the world, and is dedicated to research aimed at enabling Networked Knowledge using Semantic Web technologies. Our project partner the Digital Repository of Ireland is a key national aggregator of cultural and social data.

Essential criteria for the applicant:

  • Bachelor’s degree in Computer Science, Informatics, or relevant subjects.
  • Strong Java or Python programming skills.
  • Minimum two years software development experience.
  • Knowledge of software design, development and maintenance processes.
  • Experience in some, if not all, of the following:
    • Social event detection
    • Social network analysis
    • Semantic Web and Linked Data technologies
    • Stream processing
  • Proven ability to work independently or in a team environment.
  • The applicant should be creative and enthusiastic, with excellent communication skills.

It is desirable that the applicant will possess:

  • A Masters or MPhil degree in Computer Science, Informatics or relevant subjects.
  • A Masters degree in Humanities, Social Sciences, Journalism, Communication, Political Studies or relevant subjects.
  • Strong background in front- and back-end Web application development.
  • A good record of theoretical and applied research on the Social Web and the Semantic Web, warranted by a scientific publication record in workshops, conferences, journals and book chapters and/or awards in national or international challenges.
  • Industrial experience and the ability to collaborate with industry partners.
  • Experience in Web-related standardisation activities (including W3C) as both author and editor.
  • Excellent interpersonal communication and scientific writing skills.

Salary range: €24,478 – €32,173 per annum.

Position RA1 (ref NUIG 2-14) is available from 1st February 2014 and is full time for a fixed term of 12 months.

Position RA2 (ref NUIG 1-14) is available from 1st February 2014 and is part time (0.5 full time) for a fixed term of 12 months, or full time for a fixed term of 6 months.

For informal discussion about this post please contact: Bahareh.Heravi@deri.org

To apply: Applicants should include a cover letter, curriculum vitae, a list of publications, a research statement and the names and addresses of at least three referees, via email (text, postscript or PDF only) to: hr.ie@deri.org.

Please put reference number NUIG 2-14 for the full-time position or NUIG 1-14 for the part-time position in the subject line of your e-mail application.

 

Closing date for receipt of applications is 5pm, 31 January 2014.

 

About NUI Galway 

The National University of Ireland, Galway (www.nuigalway.ie) is home to more than 15,000  students across five Colleges with highly active agendas in teaching and research.

The Insight Centre for Data Analytics (www.insight-centre.org) is a joint initiative between researchers at NUI Galway, University College Dublin, University College Cork, and Dublin City University, as well as other partner institutions. It brings together a critical mass of more than 200 researchers from Ireland’s leading ICT centres to develop a new generation of data analytics technologies in a number of key application areas.

The €75m centre is funded by Science Foundation Ireland and a wide range of industry partners.  Insight’s research focus encompasses a broad range of data analytics technologies and  challenges, including machine learning and data mining, media analytics and optimisation, decision analytics, personalisation and recommender systems, the Semantic Web and Linked Data, and the sensor web. Together with more than 30 partner companies, Insight researchers are solving critical challenges in the areas of Connected Health and the Discovery Economy.

National University of Ireland, Galway is an equal opportunities employer. 

 

Research Assistants vacancies in Trust for Social Semantic Journalism

The Digital Humanities and Journalism group at the Insight National Centre for Data Analytics at NUI Galway (formerly known as DERI) is seeking outstanding candidates for two full-time fixed-term Research Assistant positions in the area of trust and provenance in the Social Semantic Web and journalism. The positions are funded by Science Foundation Ireland and are available from 1st February 2014 to a contract end date of 31st January 2015.

The successful candidates will work on a project on ‘trust verification for social media content and sources for news production’, funded by Science Foundation Ireland. They will conduct research in the areas of social event/breaking news detection and verification, social data provenance, Social Network Analysis, Semantic Web and Linked Data. They will also be expected to play a strong development role in the project, and to collaborate with key national broadcasting organisations.

The Insight Centre for Data Analytics at NUI Galway (formerly DERI) hosts one of the most internationally-recognised Linked Data research groups in the world, and is dedicated to research aimed at enabling Networked Knowledge using Semantic Web technologies.

Essential criteria for the applicant:

  • Bachelor’s degree in Computer Science, Informatics, or relevant subjects.
  • Strong Java or Python programming skills.
  • Minimum two years software development experience.
  • Knowledge of software design, development and maintenance processes.
  • Strong background in front- and backend Web application development.
  • Have experience in some, if not all, of the following:
    • Social Network Analysis
    • Social Media Provenance
    • Semantic Web and Linked Data technologies
    • Stream processing
  • Proven ability to work independently or in a team environment.
  • The applicant should be creative and enthusiastic, with excellent communication skills.

It is desirable that the applicant will possess:

  • A Masters or MPhil degree in Computer Science, Informatics or relevant subjects.
  • A Masters degree in Journalism, Communication, Political Studies or relevant subjects.
  • Familiarity with the news and media industry, knowledge of the news production process and, preferably, experience of working in or with such organisations.
  • Experience in / familiarity with the following is highly desirable:
    • Digital/Online/Citizen Journalism
    • Natural Language Processing
    • Text Mining
    • Data Visualisation
    • Data Verification
  • A good record of theoretical and applied research on the Social Web and the Semantic Web, warranted by a scientific publication record in workshops, conferences, journals and book chapters and/or awards in national or international challenges.
  • Industrial experience and the ability to collaborate with industry partners.
  • Experience in Web-related standardisation activities (including W3C) as both author and editor.
  • Excellent interpersonal communication and scientific writing skills.

Salary: €24,592 – €31,083 per annum  (public sector pay policy rules pertaining to new entrants will apply).

Start date: 1st February 2014.

For informal discussion about this post please contact: Bahareh.Heravi@deri.org.

To Apply: Applicants should include a cover letter, curriculum vitae, a list of publications, a research statement and the names and addresses of at least three referees, via email (text, postscript or PDF only) to: hr.ie@deri.org.

Please put reference number NUIG 123-13 in the subject line of your e-mail application.

 

 Closing date for receipt of applications is 5pm, 19 January 2014. 

 

About NUI Galway

The National University of Ireland, Galway is home to more than 15,000 students across five Colleges with highly active agendas in teaching and research.

The Insight Centre for Data Analytics is a joint initiative between researchers at NUI Galway, University College Dublin, University College Cork, and Dublin City University, as well as other partner institutions. It brings together a critical mass of more than 200 researchers from Ireland’s leading ICT centres to develop a new generation of data analytics technologies in a number of key application areas.

The €75m centre is funded by Science Foundation Ireland and a wide range of industry partners. Insight’s research focus encompasses a broad range of data analytics technologies and challenges, including machine learning and data mining, media analytics and optimisation, decision analytics, personalisation and recommender systems, the Semantic Web and Linked Data, and the sensor web. Together with more than 30 partner companies, Insight researchers are solving critical challenges in the areas of Connected Health and the Discovery Economy.

National University of Ireland, Galway is an equal opportunities employer.

 

The Future of News – Interviews from the Summit 2013

At the Summit 2013 HuJo caught up with some of the business leaders and entrepreneurs trying to transform the media industry. Click the links below to listen to our reporter Fergal Gallagher quizzing them on the future of news.

Interview with Jimmy Maymann, CEO of The Huffington Post
Jimmy Maymann is a Danish entrepreneur who founded video marketing startup GoViral before selling it to AOL. He was appointed CEO of The Huffington Post in 2012, where he works alongside Arianna Huffington, who is now editor-in-chief.

 

Paul Quigley, founder of NewsWhip
NewsWhip is a digital media start-up based in Dublin which tracks the most shared news stories across the social web.

 


Grainne Barron, founder of ViddyAd
ViddyAd is a new startup that allows businesses to make online video at a reasonable price. ViddyAd won the award for the best Irish start-up at the Summit.

 


David Tomchak, Vizibee
Vizibee is another startup, aimed at professional journalists, which allows them to upload, broadcast and manage video content filmed on a mobile phone.

Pre-Christmas Hacks/Hackers meetup

Following on from our successful Hacks/Hackers launch, we are delighted to announce a second event in the run-up to Christmas. Hacks/Hackers Dublin will host a workshop at The Irish Times building in Dublin featuring Pulitzer Prize-winning guest speaker Justin Ferrell.

Justin is the Director of Fellowships at Stanford University’s prestigious d.school. He was formerly director of digital, mobile & new product design at the Washington Post, is an alumnus of the John S. Knight journalism fellowships at Stanford, and designed the investigative series “Angler: The Cheney Vice Presidency”, which won the 2008 Pulitzer Prize for National Reporting.

Justin is an expert in digital storytelling and promises to provide great insights on the evolving world of journalism. The free event is open to any journalists, programmers or IT professionals interested in the future of news. There will be plenty of opportunities for networking and we also hope to discuss plans for a Hackathon to be held in 2014.

The workshop will take place on Monday December 9th at 5:30 p.m. in The Irish Times building’s training room, on the corner of Townsend St. and Tara St. in Dublin.

We would like to thank The Irish Times for helping to make this event possible.

We look forward to seeing you all there!

Sign up to attend

HuJo at the Dublin Web Summit

Dublin felt like the centre of the tech world this week as the razzmatazz of the Web Summit descended on the city. Over the course of the two days the NASDAQ bell was rung, the Taoiseach (Irish Prime Minister) was driven on stage in an electric supercar by Elon Musk, and the world’s most famous skateboarder, Tony Hawk, made an appearance in the RDS (Royal Dublin Society). The Web Summit is now the biggest tech conference in Europe and has to rank up there with the likes of SXSW in the US in terms of scale, and HuJo was there to soak up all the excitement.

The atmosphere was electric from the moment we entered the main hall of the RDS on day one, and the same buzz was maintained for the full conference by the constant din of more than 950 entrepreneurs trying to pitch their start-ups to media, venture capitalists and anyone who would listen. The space was expanded from Web Summit 2012, but with the attendance tripling to 10,000, this year’s conference felt a lot busier and more hectic.

Elon Musk and Enda Kenny with the Tesla car they arrived in. Photo by Conor McCabe Photography


Big Name Speakers

More than 300 speakers appeared in keynote addresses, panel debates and ‘fireside’ chats across five stages, including the impressive main stage in a huge auditorium that was backlit with what looked like an array of Rubik’s cubes gradually changing colours as the conference unfolded. Speakers ranged from CEOs of now multinational companies, like Phil Libin of Evernote, Jay Bregman of Hailo and Aaron Levie of Box, to big-name investors like Kevin Rose of Google Ventures and others from Atomico and Andreessen Horowitz, as well as anomalies like 13-year-old Jordan Casey, who is already a serial app developer and entrepreneur. There was also a large digital media presence with speakers from Vine Media, The Huffington Post, TheNextWeb, RTE, The Guardian, and The Wall Street Journal among others. HuJo spoke to a number of them and will be publishing interviews over the coming weeks.

Start-up competitions

On top of the main stages, the exhibition floor was littered with smaller stages where entrepreneurs were constantly making short pitches to panels of experts as part of a two-day competition. London start-up Import.io, which attempts to structure the web more coherently to make data open to everyone, won the PITCH later stage company award. New York-based Placemeter, which automatically extracts data from live video streams, took the early stage award, whilst ViddyAd, which allows companies to make professional videos at low cost online using stock content, won the ‘Spark of Genius Award’ given to the best Irish start-up.

The conference had a real start-up feel to it: the entrepreneurs pitched in front of chipboard panels rather than gleaming screens and were playing every trick in the book to try and grab your attention. There were free smoothies, free coffee, exhibitors dressed as superheroes, women in lederhosen and even a gang of leprechauns constantly roaming the RDS. It wasn’t all about technology either. At lunchtime, in between rain showers, the 10,000-strong crowd migrated to nearby Herbert Park for the Food Summit, where a huge tent housed food from some of Ireland’s best-known restaurants. There was even a Movember (a charity that encourages men to grow moustaches for the month of November in aid of cancer) stand where you could get a free hot towel shave on the eve of a shaveless month.

Wi-Fi issues

The only complaint we heard from attendees was the poor quality of the Wi-Fi signal, which left many exhibitors without a product to display. Organisers blamed the number of ‘tethered’ devices for the backlog, and each start-up stand was given a hard line for the second day, which seemed to alleviate the problem somewhat. But this didn’t dampen spirits, as nearly every attendee we spoke to had made some sort of deal or at least made some new contacts over the two days. The selling went on right to the end, up to the point where the Taoiseach, Enda Kenny, pitched Ireland to Elon Musk live on stage as a potential European HQ for his Tesla electric car company. I attended the Web Summit last year and thought ‘it can’t get much bigger than this’, but after it nearly tripled in size the question for its impressive organiser Paddy Cosgrave is: where does the Web Summit go from here?

 

Main image courtesy of Dan Taylor/Heisenberg Media

Digital Humanities and Journalism Seminar

12.00 – 14.00, Wednesday 9 October, Moore Institute Seminar Room, NUI Galway.

On Wednesday October 9th, HuJo will host a lunchtime seminar on Digital Humanities and Journalism in collaboration with the Moore Institute. The event will be hosted at the Moore Institute in NUI Galway and is the first of the Autumn/Winter series of the Digital Scholarship Seminars. The aim of the seminar is to showcase the work of INSIGHT @NUIGalway (formerly DERI) in Digital Humanities and Journalism, and to provide a vision of future collaborations between technology and humanities researchers.

Digital Humanities includes digitisation, preservation, access and discovery for cultural, historical and iconographic data and corpora. This data can range from contemporary film to digitised manuscripts. Helping archivists to index, link, and contextualise material opens up our cultural legacy for personalised story-telling, new discoveries and connections.

Digital journalism can mean many things from print material published on the web to highly interactive forms of online journalism both sourced and published over the Internet via social media. The HuJo team is working on a number of Digital Journalism projects including the Social Semantic Journalism project, which aims to use semantic web technologies to help journalists use social media more effectively in the process of newsgathering and production.

Researchers at INSIGHT @NUIGalway are engaging with stakeholders and collaborators in the development of data analytic and social and semantic technologies to produce innovative solutions to real-world challenges with industrial and Cultural Institution partners.

The speakers will demonstrate how humanities researchers can benefit from collaboration with INSIGHT’s semantic web experts, citing examples from previous collaborations and potential future projects, as follows:

Professor Stefan Decker, Director of INSIGHT @NUIGalway
From Digital Enterprise to INSIGHT

In the past DERI has focused on creating Network Knowledge – we believe this is still a promising technology for Digital Humanities, which can help with many research topics. In the next period INSIGHT, a national research centre combining the strength of 5 separate centres, is going beyond DERI by combining technologies from major ICT centres. In my presentation I give an overview of the capabilities in Galway, but will also present what is available within the INSIGHT centre. INSIGHT is currently developing a Digital Humanities strand – the results from this workshop can influence which direction this will take.

Dr. Sandra Collins, Director of the Digital Repository of Ireland
Digital Repository of Ireland

The Digital Repository of Ireland (DRI) is an interactive national trusted digital repository for contemporary and historical, social and cultural data held by Irish institutions, providing online access, discovery and preservation. In addition to the national digital infrastructure, DRI works to raise awareness of the need and benefits of digital preservation and open access, while respecting ownership, rights, privacy and confidentiality. DRI seeks to share best practices with the community to enable cost savings and improved standards of preservation and access, and to inform national policy for digital preservation and access.

Dr. Bahareh Heravi, Team leader of Digital Humanities & Journalism at INSIGHT @NUIGalway
Future Newsrooms and Civic Journalism

The consumers of news and information are no longer passive and isolated consumers. Smart phones, digital cameras, mobile internet and social media platforms have made us all broadcasters of information. We consume information from traditional news sources, but also through social media platforms. We form communities to inform one another, we comment, we coordinate, and we disseminate. This ubiquity of new technologies has made it more likely than ever that an individual or a community, not a professional journalist, will be the initial source of information for a breaking news event. In my presentation I give an overview of what we think a future newsroom will look like and present the ‘Social Semantic Journalism’ framework, which is aimed at assisting journalists in the process of newsgathering and production from social media.

Professor Siegfried Handschuh, Stream leader at INSIGHT @NUIGalway / Dr. Brian Davis, Research Associate at INSIGHT @NUIGalway
Quick and Dirty Examples of Text Analytics Applications for Digital Humanities

Open source tools for Text Analytics have matured to a point where they are now poised to deliver predictable, accurate and cost-effective solutions. Furthermore, such tools have become more accessible and usable for non-experts seeking to exploit them for their own research endeavours. This presentation will describe the capabilities of such tools with respect to Digital Humanities. The presentation is aimed at non-experts such as literary academics, scholars, journalists and archivists who are interested in gaining a clearer understanding of how existing tools can be exploited.

Dr. Paul Buitelaar, Stream leader at INSIGHT @NUIGalway
Towards Extracting Author Networks from Secondary Literature

This work is concerned with a text mining application for the extraction of female author networks from secondary literature on English and Irish writers.

Using Semantic Similarity on Poetic Corpora

In this work we are concerned with the automatic comparison of poems by nineteenth-century authors Lord Byron and Thomas Moore. For this purpose a so-called ‘distributional semantics’ algorithm is trained on a data set of nineteenth-century poems in order to automatically identify any significant semantic similarity in segments of a set of poems by the above authors. The literary research question we investigate here is whether any evidence for stylistic influence by Lord Byron on Thomas Moore can be established.


The event is free and open to the public.

 

INSIGHT @NUIGalway (formerly DERI)

INSIGHT Galway (DERI) is a cutting-edge Semantic Web and Linked Data research centre and the largest and most reputable in the field worldwide. The INSIGHT Centre for Data Analytics is a joint initiative between researchers at NUI Galway, University College Dublin, University College Cork and Dublin City University, as well as other partner institutions. It brings together a critical mass of more than 200 researchers from Ireland’s leading ICT centres to develop a new generation of data analytics technologies in a number of key application areas.

INSIGHT’s Digital Humanities and Journalism group is a multidisciplinary effort aiming to bundle research and development activities around news and journalism, media, digital humanities and social sciences with social media and semantic web technologies, which are INSIGHT Galway’s core strengths. The group collaborates with media organisations and academic partners in the field to study, create, design, deploy, and assess tools and practices in Digital Humanities, media and journalism.


Image courtesy of infocux Technologies

Hacks/Hackers gets Dublin reboot

Hacks/Hackers Dublin was rebooted on Monday night with the first meeting taking place in the salubrious surroundings of the Library Bar at the Central Hotel on Exchequer Street in the heart of the city. There was an enthusiastic crowd of both Hacks (journalists) and Hackers (techies), and we were even joined by some international guests: Andy Carvin of NPR, who is famous for his unique methods of covering the news on Twitter, and Amanda Michel of The Guardian (US).

Paul Watson (right), CTO of Storyful, addressing the crowd in the Library Bar.


The evening was kicked off with a welcome talk from the group organiser, Dr. Bahareh Heravi, who heads up the Digital Humanities and Journalism group at Insight, NUI Galway. She then introduced Paul Watson, CTO of Storyful, who gave a brief presentation of the social news agency’s new ‘social search’, a Chrome plugin that allows users to search for news across multiple social media platforms. Later, attendees played an M&M game which encouraged everyone to offer up their ideas for the future of news and the Hacks/Hackers movement. Attendees were enthusiastic and came up with some great ideas, the creative juices seemingly lubricated by the alcohol from the bar.

Overall the evening was a great success. HuJo promises to hold another event before Christmas and by the end of the night had already received offers from a few parties to host a more extensive meeting. Thanks to everyone who attended and we hope to see you at the second event which is likely to be in early December.