17 changes: 17 additions & 0 deletions fri/holding_algorithms_accountable.md
@@ -0,0 +1,17 @@
### Holding algorithms accountable

*Printouts of recent Tow Center report upstairs*

**Question of responsibility**

* Correlation does not equal causation
* Correlation does not equal intent (with regard to algorithms - maybe the designer did not intend for this to happen)
* Understanding the design process of building algorithms can help understand why they function in the way they do


**A couple of different ways that journalists can approach algorithms:**

* Why are they important? (explainers etc)
* Pick apart algorithms and expose flaws - how might an algorithm be discriminatory?
* Does it break a law? Does it make us feel uneasy? If so, it should be questioned

37 changes: 37 additions & 0 deletions fri/introduction_to_r.md
@@ -0,0 +1,37 @@
### Introduction to R

##### Sharon Machlis, Computerworld

<http://bit.ly/RGuideIntro>
<http://bit.ly/Rresources>

R project for statistical computing - great for sophisticated data analysis, but also useful for basic and intermediate work eg. grouping, plotting, exploratory data visualisations. Works primarily on the command line

**When to use R over Excel:**

* Amazing community and lots of add-on packages
* Writing scripts - reproducible research - check, share, reproduce again and again. Create a script once and reuse it

**Tour of R**

* Console - where you type in scripts
* Top right - history of all commands you have run
* Environment tab - as you store variables, they will appear here
* Bottom right - where visualisations will appear, where you can view add on packages, help files etc
* When you run a command, R shows the index of the first result on each output line in brackets, eg. [1]
* First command: library(lattice)
* install.packages() - installs add-on packages
* data() - lists the available sample datasets
* plot(melanoma) - draws a chart of the melanoma data in the bottom right window
* Up arrow cycles through the history of commands
* Fit and store a regression line:
    * regline <- lm(melanoma$incidence ~ melanoma$year)
    * lm: linear model
    * abline: adds a trend line to the plot, eg. abline(regline)
    * <- is the assignment operator (like = in other languages)

* Store variables eg. x <- 5
* c() function - combines values into a vector
* Indexing starts at 1 not 0, unlike most other programming languages

When selecting rows, always remember to add a comma after the index, eg. melanoma[5,] shows everything in row 5. Put the comma first to select a column instead, eg. melanoma[,2] shows everything in column 2
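
For readers more at home in Python, here is a rough sketch of what the lm() call above fits: an ordinary least-squares line with one predictor. It is not the session's code; the function name fit_line and the numbers are made up for illustration and only stand in for melanoma$year and melanoma$incidence.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up values standing in for melanoma$year and melanoma$incidence
years = [1936, 1937, 1938, 1939, 1940]
incidence = [1.0, 1.2, 1.4, 1.6, 1.8]
intercept, slope = fit_line(years, incidence)

# Python counts from 0, so the fifth observation is index 4;
# in R the fifth row would be melanoma[5,]
fifth = (years[4], incidence[4])
```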
122 changes: 122 additions & 0 deletions fri/lightning_talks.md
@@ -0,0 +1,122 @@
### Lightning talks

#### Refactoring - why your code sucks and how to fix it
**@onyxfish**

*"Code read and modified much more quickly than written"*

What is refactoring? - improving code quality without adding features

**Follow the code smells:**

1. Duplicated code - if it's there twice it's wrong
2. Loooong functions
3. Inconsistent style - don't copy and paste!

Develop good habits by refactoring, otherwise your code is like word clouds - bad!
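
A minimal sketch of the first smell, duplicated code, and one way to remove it. The byline task and all function names are hypothetical, chosen only to illustrate the idea; they are not from the talk.

```python
# Before: the same clean-up logic is written out twice.
def reporter_byline_before(name):
    cleaned = " ".join(part.capitalize() for part in name.strip().split())
    return "By " + cleaned


def editor_credit_before(name):
    cleaned = " ".join(part.capitalize() for part in name.strip().split())
    return "Edited by " + cleaned


# After: one helper holds the logic, so a fix only has to be made once.
def tidy_name(name):
    """Collapse extra whitespace and capitalise each word of a name."""
    return " ".join(part.capitalize() for part in name.strip().split())


def reporter_byline(name):
    return "By " + tidy_name(name)


def editor_credit(name):
    return "Edited by " + tidy_name(name)
```

The behaviour is unchanged, which is the point of refactoring: quality improves without adding features.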

---

#### A few of my favourite wee things
**@lenagroeger**

* Small multiples - sequences of small graphics
* Tiny text - highlight difference in size, outliers
* Tiny art
* Mini maps - can give more context for a story - little map next to a large map
* Inline pictures in text
* Mini graphics - sparklines, can now be put in tweets. Icons eg. navigation icons - noun project. Tiny states (State Face)

---

#### Natural language processing in the kitchen
**@anthonyjpesce**

* Text dump of LA Times full of recipes
* In Python: import nltk
* Feed trainer words, parts of speech, trigrams
* Turn horrible text file into structured data
* Good if you know in advance what you are looking for
* Blog post: <http://lat.ms/nltk> Slides: <http://lat.ms/nlpslides>
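
The notes mention feeding a trainer words, parts of speech and trigrams. As a minimal sketch of the trigram part only, the snippet below builds trigrams from one ingredient line using the standard library. It does not use nltk and is not the LA Times code; the sample line and the function name are assumptions made for illustration.

```python
def trigrams(tokens):
    """Return every run of three consecutive tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


# Made-up ingredient line, split on whitespace into tokens
line = "2 cups all-purpose flour , sifted"
tokens = line.split()
features = trigrams(tokens)
```

Each trigram gives a tagger some context about what surrounds a word, which helps it decide whether a token is a quantity, a unit or an ingredient.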

---

#### Five algorithms in five minutes
**@chasedavis**

* All code: <http://github.com/cjdd3b/nicar2014>
* Loops: slow code - vectorisation
* Naive Bayes - solves classification problems
* Iterative algorithms
* Vantage point trees - for fuzzy search. Solves problem of adults who can't spell
* Latent Dirichlet Allocation - topic modelling algo.
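
To make the Naive Bayes item concrete, here is a small standard-library sketch of a text classifier with add-one smoothing. It is not taken from the linked repository; the training headlines, labels and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and per-label words from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)  # prior probability of the label
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero the score
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data
examples = [
    ("council votes on budget", "politics"),
    ("mayor wins election", "politics"),
    ("team wins final match", "sport"),
    ("striker scores late goal", "sport"),
]
model = train(examples)
```

Calling classify("late goal in match", model) picks the label whose word counts best explain the text.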

---

#### What can we learn from terrible data viz
**@katiepark**

* Not everything needs a chart!
* Watch your scale
* Know your data types
* Double check your information, then do it again!
* Don't overdo it! Simple is often best

---

#### Calculus for journalism
**@dataeditor**

* Journalists should do maths! Creative discipline
* Calculus = change
* Compound interest - can compound at different rates
* Chemical leak - spouting cylinder
* Riemann sums - calculate the area under a curve
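
A minimal sketch of a Riemann sum, assuming the midpoint variant and a simple curve chosen for illustration (the talk's own examples are not reproduced here). The function name riemann_sum is hypothetical.

```python
def riemann_sum(f, a, b, n=1000):
    """Approximate the area under f between a and b with n midpoint rectangles."""
    width = (b - a) / n
    # Each rectangle's height is the curve's value at the middle of its slice
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Area under y = x^2 between 0 and 1; the exact answer is 1/3
area = riemann_sum(lambda x: x ** 2, 0.0, 1.0)
```

Increasing n makes the rectangles thinner and the estimate closer to the exact area.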

---

**@sisiwei**

---

#### The whole internet in 5 minutes
**@jeremybowers**

[Slides](https://docs.google.com/presentation/d/1h41aj_hg-8Y0cotOjSIEOBBoPUxIuU_Ol45jbFhVqlY/edit#slide=id.p)

[Text](https://gist.github.com/jeremyjbowers/9279751)

---

#### How to raise an army
<http://tylerjfisher.com/nicar2014>

* How do we make Knight Lab a community of webmakers?
* Community: Brown bag lunches
* learn.knightlab.com - self guided curriculum for digital literacy
* Open lab hours - people come and work on things
* Lessons learned - kill imposter syndrome: <http://newsnerdfirsts.tumblr.com>
* Communication is the key

---

#### You must learn!

*Five lessons from the history of data viz*

* Nothing we are doing is new
* Seriously, nothing we are doing is new
* It is a wild world out there
* You've already been scooped by a computer

[Ben Welsh's slides](https://docs.google.com/presentation/d/1f9RJO8-6pxJn1LWvVmWaLPuVzklxiTSJL5PScl3NcPI/edit#slide=id.g2b056d122_07)










26 changes: 26 additions & 0 deletions fri/maps_with_leaflet_and_mapbox.md
@@ -0,0 +1,26 @@
### Maps with leaflet and mapbox

1. Get data from Texas open data site. Add FIPS number

2. Open shapefile in QGIS

3. Tell QGIS the type of each column (mark your relevant columns as strings) in a .csvt file saved in the same folder as the CSV:

eg. "String","String","String","String","String","String","String","Real","String","String","String"

4. Joins: Join shapefile and data on FIPS number. Now attribute table should show both

5. Save as geojson
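
The join in step 4 is done in QGIS in the session. As a rough sketch of the same idea in code, the snippet below copies CSV columns into GeoJSON features that share a FIPS code. The function name, the column names and the sample values are all assumptions for illustration, not the session's files.

```python
import csv
import io


def join_on_fips(geojson, csv_text, key="FIPS"):
    """Copy each CSV row's columns into the feature with the same FIPS code.

    The features are updated in place and the same object is returned.
    """
    rows = {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}
    for feature in geojson["features"]:
        match = rows.get(str(feature["properties"].get(key)))
        if match:
            feature["properties"].update(match)
    return geojson


# Hypothetical sample: one county feature (a point stands in for the polygon)
counties = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"FIPS": "48453", "NAME": "Travis"},
            "geometry": {"type": "Point", "coordinates": [-97.78, 30.33]},
        }
    ],
}
# Made-up CSV with a placeholder value, keyed on the same FIPS column
data = "FIPS,population\n48453,1000\n"
joined = join_on_fips(counties, data)
```

FIPS codes are compared as strings here, for the same reason step 3 marks columns as strings: a numeric type would drop any leading zeros and break the match.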

**All session notes and example files can be found here:**
<https://github.com/brickaa/mapbox-leaflet-demo>

* Possible to style geojson files with leaflet, but it works really well with Mapbox
* example1.html
* On the command line, navigate to your folder
* One line of Python to show in browser
* example2.html - added styles and div for legend, set zoom level etc
* Add grid layers to show interactive elements
* example.html - added marker


42 changes: 42 additions & 0 deletions sat/crossing_the_language_boundaries.md
@@ -0,0 +1,42 @@
### Crossing the language boundaries

* Email sucks - better to use a ticket system such as Jira

* Have developers in your editorial meeting

* Make sure you leave enough time for testing news apps

* Is it something you might want to reuse or repurpose?

* What is the absolute minimum set of things it is worth launching - minimum viable product? Can you launch more features as time goes on?

* A CMS is inflexible, but inflexibility can be an opportunity, as a CMS relies on a clear workflow. Gateway drug to programming? Key to understanding logic

* Participating in hackathons - solutions to problems: 2:1 devs to reporters

* Source: learning sessions - Matt Waite series on journalism ethics

* Just because it's a big story doesn't mean it should be a big application. Just because the journalist thinks the data is important, it doesn't mean it should be a big data application

* Developers should avoid pushing for unrealistic deadlines from journalists because the reporting process takes a long time

* Always remember to fact check the app!!

* Very difficult to be a very good developer and a very good reporter simultaneously - very different types of disciplines, but good to have enough of an understanding to be able to *communicate* easily between each other

* Learning lunches - talking about things, putting them in context, talking about what is and is not easy to do, easy to scrape vs hard to scrape etc. Create a space for everyone to ask questions that there is no other good time to ask - [link](https://github.com/veltman/learninglunches)

* Static vs dynamic:
    * Static - Actual file sitting somewhere, pre-wrapped food for you to take
    * Dynamic - food in a restaurant, assembled on the fly from a db because you asked for it, file does not actually exist


* Static site generator - the benefits of a static site, but with a database behind the scenes, so our users feel like they are using a vending machine. Speedier than a dynamic site
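
A minimal sketch of the static site generator idea: records that would normally live in a database are rendered once into plain HTML files, which a web server can then hand out without doing any work per request. The template, record fields and function name are assumptions for illustration, and values are not HTML-escaped here.

```python
import tempfile
from pathlib import Path
from string import Template

PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def build_site(records, out_dir):
    """Render one HTML file per record and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for record in records:
        path = out / f"{record['slug']}.html"
        path.write_text(PAGE.substitute(title=record["title"], body=record["body"]))
        written.append(path)
    return written


# Hypothetical records; in practice these would come from a database query
records = [
    {"slug": "budget", "title": "City budget", "body": "Spending by department."},
    {"slug": "crime", "title": "Crime numbers", "body": "Monthly totals."},
]
pages = build_site(records, tempfile.mkdtemp())
```

Re-running the build after the data changes regenerates the files, so readers always get the pre-wrapped version.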

* Workflow - a lot of backward, sideways movement on the reporting side

* Who edits data applications? Fuzzy line between reporters and developers. Reporters would love to have great interactives whilst they are writing so they could learn from them

* What can be done, should be done and what you want to do. Always remember to go back to the mission - get information to the public

* Check your metrics - why do your editors want these apps? Will people actually use them?
77 changes: 77 additions & 0 deletions sat/it's_not_just_for_looks.md
@@ -0,0 +1,77 @@

### It's not just for looks: Presentation as a storytelling tool

*Moderator: Chrys Wu: @MacDiva*

[Slides](http://j.mp/nicar14)

#### Helene Sears: @MateerS

BBC online workflow:

1. Define: Getting the who, what, when, where of the project down on paper. Figure out who has a right to an opinion, weekly check-in schedule etc. Write a headline of what you want to create
2. Brainstorm: All hands on deck, as many ideas as you can come up with, quantity not quality. Designers to understand editorial process. Journos to try not to come with a sketch of what you expect and be disappointed if it doesn't come off. No one party should dominate. Keep pushing on ideas, first idea you come up with probably not the best one.
3. Wireframe and prototype: Test frequently and early, sanity check your ideas to make sure they make sense to people not involved in the project. Set up questions you will ask in user testing. Distil information to find out which parts are relevant. Use proto.io and paper prototypes
4. Refine and protect: Mobile first starting with the smallest screen. Consistent, optimised user experience. Accessibility should be key
5. Deliver: More and more designer/developers at the BBC. Testing on a whole stack of devices, every browser

[BBC newsgraphics website](http://bbc.co.uk/newsgraphics)

Twitter: @bbcnewsgraphics


---

#### Aron Pilhofer: @Pilhofer

NYT: Work completely outside of the content management system. Everything we do is collaborative

Would describe the team as a "product development team"

Newsrooms very good at doing single pane graphics, but not so good at devising big new ways

What's the story? What's the headline?

**Oscar coverage:**

Old model:

* Feature creep - a huge problem. User comes to site and can't figure out what to do in the first 7 seconds

New model:

* Get rid of all tabs - tabs are LAZY. Design becomes much cleaner
* Twitter coverage not just on a hashtag
* Presenters referenced live blog and live blog referenced them
* Deprioritised the ballot because it was only important to a small number of users (analytics)
* Event tracker to check what readers are doing
* Do readers use the tools in the way that we expect? Answer: yes

Need to improve on accessibility testing

---

#### Alyson at NPR @Alykat

**Planet Money makes a T-shirt**

* Eight people on the team: two designer/developers, three interactive producers, project manager, journos
* Brainstorming - started with the process story, storyboarded everything
* Visual style guide for all photographers to help get a consistent look. Cheat sheet made and specified down to the lenses that would be used, technical specification
* Inspiration - Serengeti Lion project from National Geographic
* Wireframed structure - visually driven with a text and graphic component
* Broke down into chapters. How do we break this up within chapters?
* Amazing visuals of the process so lead with video.
* Conversational voice guides you through
* Get more in-depth - scripts for video and text below written mindfully of each other
* Design decisions around the interface were deliberate
* If it doesn't work on mobile it doesn't work
* No autoplaying video on mobile
* Reach out on Instagram to ask people to share pictures of themselves wearing the T-shirt
* User testing is so important: watched as people went through the site - significant changes occurred after this





29 changes: 29 additions & 0 deletions sat/scraping_data.md
@@ -0,0 +1,29 @@
### Scraping data

**When to scrape:**

*When no one has the data you need*

* Eg to answer the question - how many adoptions have taken place internationally where the children are subsequently rehomed. Scraping a messageboard
* DownThemAll - download manager, plugin for Firefox

*When they won't give you the data you need and you don't have time for FOIA*

* The Second Mile - child abuse case. Pennsylvania State Police database - sex offender registry
* Firebug - get under the hood of the website. In Chrome, open the Network tab and run a search on the website. Look for the 'POST' request to the server and the parameters that the page is sending with it
* Used a free tool called 'ie unit' - quick focus interface. As you navigate through the page, it figures out what JS is running as you click through. Write a script, then run it and it will download the data
* Helium scraper - basic version of software for $99. 10 day trial, point and click tool
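
Once the Network tab has shown you the POST request and its parameters, the same request can be rebuilt in a script. The sketch below uses only the Python standard library; the endpoint URL and the parameter names are placeholders, not those of any site mentioned in the session, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_search_request(url, params):
    """Build the same kind of form POST the browser sends when you search."""
    body = urlencode(params).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical endpoint and parameter names; copy the real ones from the
# Network tab of the site you are scraping.
request = build_search_request(
    "https://example.com/search", {"county": "Centre", "page": "1"}
)
# To actually send it: html = urlopen(request).read()
```

Looping over the page parameter and counting the records returned is one way to approach the "how do you know you got them all?" caution.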

*When there is no one to ask or you don't want them to know you are doing it*

* Counterfeit pharma coming out of China

*When you want regular updates*

* Crime numbers - good for police reporters, provide context in your reporting

*Cautions:*

* How do you know you got them all?
* Look for the hidden treasure - 'download dataset'
* Involve the lawyers, be mindful of legal implications
3 changes: 3 additions & 0 deletions sat/threat_modelling_for_journalists.md
@@ -0,0 +1,3 @@
### Threat modelling for journalists

[Link to slides](http://www.scribd.com/doc/209968137/Threat-Modeling-Planning-Digital-Security-for-your-Story)