13.6: XML and JSON - Processing Tutorial



okay in this video I want to look at

getting data in a standardized format

that we haven't seen before so we've

seen raw text come into processing we

chopped it up into words and we counted

those words we've seen tabular data

which is really wonderful is like oh I

found all these numbers and I want I

want to use these numbers and look at

this they're just sitting right there

separated by commas it's a very easy

format I know how to parse it and I've

got all these numbers so this is what

we're hoping for in life that there's

data that we want we found the data and

it's in a standardized format meaning I

can read this format in particular not

me but a computer program can read this

format very easily in the next video or

some video after this one we'll look at

data that comes in in a non-standardized you know

just kind of a mess what do we do if I have

to manually figure out like how is this

data structured and organized and it's

not really meant for me to parse there's

still ways we can we can solve this

problem but in this video we're going to

live in a happy place where the data is

either in XML and you see what I've

written here

extensible markup language I believe

that stands for or JSON these are two

formats I want to look at JavaScript

object notation so these are standards

that have been developed in order for

applications to serve up data and this

is what we think of as an API an

application programming interface an

interface for two different applications

to talk to each other one might be a web

server like the New York Times web

server wants to talk to my processing

sketch oh how exciting how do they talk

to each other by sending data back and

forth in some standardized way that they

each know about XML JSON so one of the

things we'll see is that ah we have this

example with the table we made these

bubbles that had an x and a y and a

diameter and a label we're going to see

exactly this we're going to see exactly

this example duplicated in XML and JSON

but more likely the scenario you

might be in is the following and I'm

unfortunately wearing a green shirt

today which I I need to attach the

microphone to something but you can sort

of see through me so this might be your

scenario you found something online an

API it's free I can get the data I want

to visualize

weather information OpenWeatherMap

if we read through this page we're going

to find lots of documentation for how to

get a particular piece of data I've I've

already grabbed some for you

this is now and you can see this is what

we need to sort of start also getting

used to this idea of query strings so

looking here we see there's a URL you

know google.com amazon.com

api.openweathermap.org but what we're going to

have to figure out in processing is how

do we request the data with some

specificity I don't just want all the

weather of all of the world I want the

weather in London I want the data to be

in XML I want my units to be metric and

I want maybe seven seven days worth of

data I believe so we have to get used to

this Q equals London and mode equals XML

these name value pairs that make up a

request to a particular API but we're

going to see that in a little bit later
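
To make those name value pairs concrete, here is a minimal sketch in Python of assembling a request URL from them. The host `api.example.com` and the parameter names `units` and `days` are placeholders, not the real API; only `q` and `mode` come from what was said above, so check the documentation of whichever service you use.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/forecast"


def build_request_url(city, mode="xml", units="metric", days=7):
    """Join the base URL and the name/value pairs into one query string."""
    params = {"q": city, "mode": mode, "units": units, "days": days}
    # urlencode escapes spaces and other special characters for us.
    return BASE_URL + "?" + urlencode(params)


url = build_request_url("London")
print(url)
```

The same idea works with plain string concatenation, but an encoder takes care of cities like New York that contain a space.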

but it's good to kind of get that into

our minds right now and now you can

start to see oh look here's data I have

weather data it's in a location the name

of that location is London the country

is Great Britain here's my location some

more longitude and latitude and altitude

and all sorts of other information so

this looks there's something that's

similar to a table here in that there's

chunks of data in between these tags so

the interesting thing about XML and also

JSON is the data is stored in a tree

structure which actually will give us a

great deal more flexibility and power in

terms of organizing our data than simply

a table would so let's think about let's

use weather I'm making this up on the

fly hopefully it's going to be okay so

let's say I'm asking for weather

information and weather is the root of

my tree and then maybe I have a location

which is a child of the weather which is

London and then maybe I have five

children like Monday Tuesday Wednesday

Thursday Friday and each one of those

has a temperature and a high and a low

like a current temperature a high and a

low you could keep going on and

on but you can see this is how you might

think of the data this is a bit more

flexible than just having to have a flat

data table where

there's only columns and rows here we

can sort of think of a database of

objects there might be another city

that's coming in you know New York and

that has a bunch of days of weather and

each one of those days also has a

temperature and high and low so this is

how the data is structured in a tree now

what does this actually look like
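
As a rough sketch of the made-up weather tree described above, here it is written out as XML and walked with Python's standard library. The tag names, attribute names and every number are invented for illustration; a real weather service will organize things differently.

```python
import xml.etree.ElementTree as ET

# Made-up data: weather is the root, each location is a child,
# and each day is a child of a location.
WEATHER_XML = """
<weather>
  <location name="London">
    <day name="Monday" temp="14" high="17" low="9"/>
    <day name="Tuesday" temp="16" high="19" low="11"/>
  </location>
  <location name="New York">
    <day name="Monday" temp="22" high="25" low="18"/>
    <day name="Tuesday" temp="24" high="27" low="19"/>
  </location>
</weather>
"""

root = ET.fromstring(WEATHER_XML.strip())


def high_for(root, city, day_name):
    """Walk root -> location -> day and return the high attribute."""
    for location in root.findall("location"):
        if location.get("name") != city:
            continue
        for day in location.findall("day"):
            if day.get("name") == day_name:
                return int(day.get("high"))
    return None  # no matching city or day


print(high_for(root, "London", "Tuesday"))
```

Adding another city is just another location child under the root, which is the flexibility a flat table does not give you.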

so I want to pull up if you remember so

I don't have all of this kind of just

like ready to go on the fly here but I'm

going to go into our examples and I'm

going to look at where we have load

save table this is an example we were

looking at in the previous one of the

previous videos and we have this table

each one of these things has a label has

an x y and diameter and we can kind of

imagine that that table is something

that's quite familiar quite comfortable

for us to sort of think of this data in

a table XY diameter name

XY diameter name let's look at exactly

this data in XML

so let's come back here where was I had

oh I was there I was there a second ago

chapter 18 let's look at load save XML

and I'm going to go into the data folder

and we grab this piece of data and do I

have sublime text somewhere no let's

just open it in let's try that yeah and

make a bigger bigger bigger come on here

we go here's the data

oh we move it to the side practice makes

perfect right okay so you can see here

there is a root node bubbles bubbles has

four children a bubble a bubble a bubble

a bubble each bubble has three children

a position a diameter a label a position

a diameter a label look the position has

two attributes an x and y the diameter

has a piece of content a number the

label has a piece of content like sad so

this you know this is how the data is

structured it is a tree each

element of the tree has an open tag like

weather and a close tag

slash weather each child then is

inside of this particular node like the

city might be London so city is a child

now the city might actually have other

children inside of here in which case I

might put the end tag down here and then

put some other children in the actual

where the line breaks are doesn't actually

matter that's just for us the human

being to be able to see it so as we

start to look at data online you're

going to start you're going to find the

data like this and it becomes detective

work for you you have to figure out okay

weather is the root then there's city

and I want then the Tuesday a high

temperature so what is the path to that

it's the child of weather the child of

the first child of weather and the

second child of that child and then the

high attribute so if we go to an actual

example where this happens like this is

actually using Yahoo Yahoo weather

instead of open weather map and I run

this let's just run this to see that it

works we can see here I'm getting

today's high oh I'm not here I'm getting

today's high is 72 degrees in this zip

code which is the zip code I'm standing

in right now and the forecast is partly

cloudy how did I get that come back here

we can look into the example you can say

okay first of all this is kind of key

look at this URL I have a URL which is

requesting that XML weather data plus

zip plus a variable P equals zip so

let's just take this for a second and

put this into the browser and you can

see that's the URL I'm going to but

there needs to be an argument P equals

what I could get the weather in where I

am right now one zero zero zero three

that zip code and you can see here's

all of that XML data coming in I could

change this to

nine oh two one oh and here's the

weather in Beverly Hills California all

that weather is coming in so while if

I'm in the browser I'm just typing this

stuff up into the URL query address area

thing but in processing itself I need to

form that URL as a string and I can

concatenate two strings together with

the plus operator so this is a very

simplistic example if I were to pull up

I have another example which is loading

from New York Times you can see here

there's a bit more stuff going on I'm

searching for processing in the newest

article I'm using an article search here

there's a query there's a sort order I

also need to have an API key so I have

to form that URL how did I figure out

how this URL is formed I'm simply doing

that by going to New York Times website

and reading through its documentation so

there's no like catch-all scenario here

I'm just kind of showing you all the

bits and pieces but you will have to do

that detective work yourself so back to

this weather example how are we getting

that stuff load XML so we saw load

strings gives us a txt file load table

we can load a CSV any type of tabular

data and now we have load XML which is

assuming that whatever the

query is a file or a URL it worked and what

we're getting is XML data and now XML

getChild channel item yweather

forecast get the integer for the high

temperature get the string for the text
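
Here is a small Python sketch of that same lookup: follow the path channel, item, yweather forecast, then read the high as an integer and the text as a string. The feed below is a hand-written stand-in rather than a real response, and the namespace address bound to the yweather prefix is an assumption; it only has to match between the sample and the lookup.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like the feed in the video (not real data).
SAMPLE_FEED = """<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <item>
      <yweather:condition text="Fair" temp="70"/>
      <yweather:forecast day="Mon" high="72" low="58" text="Partly Cloudy"/>
    </item>
  </channel>
</rss>"""

# Map the prefix used in the path to the namespace declared in the feed.
NS = {"yweather": "http://xml.weather.yahoo.com/ns/rss/1.0"}

root = ET.fromstring(SAMPLE_FEED)

# The path mirrors the nesting: channel -> item -> yweather:forecast.
forecast = root.find("channel/item/yweather:forecast", NS)
high = int(forecast.get("high"))   # attribute read as an integer
text = forecast.get("text")        # attribute read as a string

print(f"Today's high is {high} and the forecast is {text}")
```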

so this is how I search into a

particular piece of XML data and if I go

if I go back to the browser and we're

looking at this you can see okay what

was it it was oops it was channel item

yweather forecast look at this channel is

first I already forgot let me move this

over to the side here

oh this was going so well channel item

yweather forecast okay channel item

where's item

somebody find me item there it is down

at the bottom here item yweather

condition is there a yweather

forecast there it is

yweather forecast and here that's the

information that I wish for so I'm kind of

botching this you know

I'll try to redo this video at some

point but you get the idea so you

you load a URL you load it into load XML

and you find that piece of data you're

looking for in addition to a URL we

could see here that sorry load save XML

this particular example now I can run

this example which is just loading in an

XML data file this is the tabular data

with bubbles and look at this here we go

I want to get give me the position and

give me the X and give me the Y then

give me the diameter so this is the

syntax much like how we look through a

table

we looked at every single row and every

single column now we have to look at

every single child and in each child let

me pull out that x value let me pull out

that Y value and the same way that in

table we could then save that table data

back out we can also save XML data back

out and you can see that's happening

right here

save XML data slash XML okay
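
A minimal sketch of that loop over children, written in Python rather than Processing: read every bubble, pull out the position attributes and the diameter and label content, then add one bubble and write the XML back out. The sample values are invented, and the result is serialized to a string instead of a file to keep the sketch free of side effects.

```python
import xml.etree.ElementTree as ET

# Invented sample in the shape described above: bubbles is the root,
# each bubble has a position (x, y attributes), a diameter and a label.
BUBBLES_XML = """<bubbles>
  <bubble>
    <position x="160" y="103"/>
    <diameter>43.2</diameter>
    <label>Happy</label>
  </bubble>
  <bubble>
    <position x="372" y="137"/>
    <diameter>52.4</diameter>
    <label>Sad</label>
  </bubble>
</bubbles>"""


def read_bubbles(root):
    """Visit every bubble child and collect its values into a dict."""
    bubbles = []
    for bubble in root.findall("bubble"):
        position = bubble.find("position")
        bubbles.append({
            "x": float(position.get("x")),             # attribute
            "y": float(position.get("y")),             # attribute
            "diameter": float(bubble.find("diameter").text),  # content
            "label": bubble.find("label").text,        # content
        })
    return bubbles


root = ET.fromstring(BUBBLES_XML)
bubbles = read_bubbles(root)

# Add a new bubble to the tree, then serialize the whole thing again.
new_bubble = ET.SubElement(root, "bubble")
ET.SubElement(new_bubble, "position", x="50", y="60")
ET.SubElement(new_bubble, "diameter").text = "30.0"
ET.SubElement(new_bubble, "label").text = "New"
saved = ET.tostring(root, encoding="unicode")

print(len(read_bubbles(ET.fromstring(saved))), "bubbles after saving")
```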

so that's kind of a haphazard very

scattered description of XML and looking

at a few scenarios now let's

think about JSON so JSON JavaScript

object notation now if we'd been

programming all along in JavaScript

you'd be like yeah we're done javascript

object notation is just the syntax of

JavaScript and in JavaScript you can

make an object like this I'm going to

say var particle equals open curly

bracket close curly bracket you know x

is at 100 y is at 200 and the

diameter is 52 so this is a line of code

in JavaScript which is declaring an

object literal an object literal

a particle that has an x a y and a

diameter and we've kind of seen this in

processing class particle float X float

Y float D so this is kind of like

instead of a template this is just

making the object itself what's

interesting about JavaScript object

notation is if you put this into a text

file this is exactly the syntax for

storing that data so if I come back over

here and I were to go to let me find

load save JSON and I'm going to open

this JSON file we can see now

oh I'm not over here I'm really screwing

this up we can now see that this is that

same data in JSON format there is

something called bubbles what is that

it's an array that square bracket means

an array with a bunch of things in it

each one of those things separated by

commas has a position which itself is an

object with an X and a Y and a diameter

which has something in a label which is

a string so this is now a standardized

format for that particular data and if I

look into the code we can see what's

happening okay I need to load that JSON

file what's in that JSON file an array

of bubble objects for each element give me

the object in each spot of

that array and then give me the

position give me the X give me the Y

give me the diameter so you know I'm not

going through like the the nitty-gritty

details of this syntax I think that if

you looked at all three of those

examples next to each other and all

three data formats you would start to

see how are things organized columns and

rows XML children JavaScript JSON arrays

which is a list of things and JSON

objects which is a collection of

properties with a name like position and

a value like 12 comma 13 an x and a y

so this is how we're working with data
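
To see arrays and objects side by side, here is a short Python sketch that reads the bubbles data as JSON and also writes the particle from earlier back out as JSON text. The bubble values are invented; the key names follow the structure described above.

```python
import json

# Invented sample: an object with one property, bubbles, whose value
# is an array; each element is an object with a nested position object.
BUBBLES_JSON = """
{
  "bubbles": [
    {"position": {"x": 160, "y": 103}, "diameter": 43.2, "label": "Happy"},
    {"position": {"x": 372, "y": 137}, "diameter": 52.4, "label": "Sad"}
  ]
}
"""

data = json.loads(BUBBLES_JSON)

# Square brackets became a list, curly brackets became dictionaries.
for bubble in data["bubbles"]:
    position = bubble["position"]
    print(position["x"], position["y"], bubble["diameter"], bubble["label"])

# Going the other way: the particle object turned into JSON text.
particle = {"x": 100, "y": 200, "diameter": 52}
particle_text = json.dumps(particle)
print(particle_text)
```

Compare this with the table and XML versions: the loop is the same each time, only the way you reach into one record changes.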

and I think I'm kind of wrapping this up

here but I think it might be useful just

to see a JSON example so this by the way

okay so here is that New York Times

example which which goes to the New York

Times API searches for the word

processing sorts it by newest

and then we get all of this JSON back so

you can see here this is kind of a mess

this is very while this is very easy for

a computer program to read this is very

hard for us to see one thing that you

can do that I think is useful to see I'm

going to take this copy paste it there's

a site JSON formatter that I can just

paste all this like garbled

JSON in and I'm going to hit process

and it formats it nicely for me so now

if I look at this full screen I can

start to like see like ah okay

so where is that there's a response and

it has some docs each has a URL oh

there's a headline main Australia issues

blanket visa honorable I should search

for something different but you can see

here's that headline and if I go into

processing and go to this New York Times

API example and I run this you can see

ah I have that exact same headline now

showing up in processing how did I do

that and you can see here that what did

I do I said okay I got all that JSON

data then I look for the response then I

looked for docs and then I looked for

the first element index 0 and then I

looked for the headline and then I

looked for main so that

detective work essentially that we did

by figuring out that it's a response

docs then there's an array the first

one that I need headline that I need

main that is mirrored here in the in

this processing sketch by looking

through all these pieces of JSON ok so I

don't know how useful this was because

it was kind of a smattering of lots of

things in like one video with both

XML and JSON and but you can see so what

I would say to you what's your exercise

now find a data source online I will try

to link to a whole bunch of examples of

things see if you can pull that data

into processing and try to just pull out

a singular unit of information a

temperature an article headline and then

you know once you've done that you might

start looking for oh can I get larger

amounts of data

can I get how many times from the New

York Times a certain word appears every

year from you know 1950 to 2014 and draw

a graph of that so so this is some

beginning steps and I will say goodbye

and I will hear from you someday

about this particular video goodbye