
Parsing XML – Go Lang Practical Programming Tutorial p.11


What’s going on everybody, welcome to part 11 of the Golang tutorial series. In this part, what we’re going to be doing is learning how we can actually parse this XML document. Just in case, for the inevitability when the Washington Post sitemap index happens to change, or something else goes wrong (because it almost certainly will, especially on a long enough timeline), this is the structure of it, and at the end I’ll show you guys how you can convert that. Just know that that’s the structure, and you can either use your own sitemap index that you find from somewhere else, or you can convert this one. If you want to be able to do that, I’ll put a link to the text-based version in the tutorial, so you can still follow along even if, for whatever reason, you can’t use the exact same one that we’re using.

Let’s go ahead and get started. The way that we’re going to do this is we’re going to use one more package, and that’s the encoding/xml package. We’re going to use that to unmarshal the document into a struct that mirrors the XML structure. We could do this ourselves totally from scratch without using encoding/xml, but that would be really tedious, whereas this package is already built to handle it. We just need to give it the structure of the data that we’re trying to decode.

The first thing that I’m going to do is clean this up. We’re no longer going to use the string body or print that line out. What we need to do now is define the structure of this XML document. First I’m going to do type SitemapIndex, and that’s going to be a struct. Inside it, at the end of the day, what do we want? We want capital-L Locations as the field, and it’s going to be a slice of the Location type, which doesn’t exist yet. Then we describe, for when we go to unmarshal it, the tag that it’s under. That’s xml, a colon, and then, in double quotes (don’t forget those), sitemap.

The other thing you don’t want to forget, which is a little less obvious in my opinion, is that you must capitalize these field names. If you don’t capitalize them, they won’t be exported. When you go to unmarshal, the package sees a lowercase field as internal, so it can’t set it and you won’t get any values from it. That’s really annoying; I got stuck on that for way too long.
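
To make the capitalization point concrete, here is a minimal sketch. The type names withExported and withUnexported, the helper decodeBoth, and the example.com URL are all made up for illustration; they are not from the tutorial’s code. It unmarshals the same XML into two structs, one with a capitalized field and one with a lowercase field.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// withExported has a capitalized field, so encoding/xml can fill it in.
type withExported struct {
	Loc string `xml:"loc"`
}

// withUnexported has a lowercase field. It is not exported, so
// encoding/xml skips it without reporting an error.
type withUnexported struct {
	loc string
}

// decodeBoth unmarshals the same XML into both structs and returns the
// value each field ended up with.
func decodeBoth(data []byte) (string, string, error) {
	var e withExported
	var u withUnexported
	if err := xml.Unmarshal(data, &e); err != nil {
		return "", "", err
	}
	if err := xml.Unmarshal(data, &u); err != nil {
		return "", "", err
	}
	return e.Loc, u.loc, nil
}

func main() {
	// Hypothetical one-entry document, not the real Washington Post data.
	data := []byte(`<sitemap><loc>http://example.com/a.xml</loc></sitemap>`)

	exported, unexported, err := decodeBoth(data)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	// The exported field holds the URL; the unexported one stays empty.
	fmt.Printf("exported: %q, unexported: %q\n", exported, unexported)
}
```

Notice that nothing fails in the lowercase case. You just get an empty value back, which is why this one is so easy to get stuck on.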

That was annoying. So anyway, Locations is of the Location type, and what’s happening here is that it’s going to be, in this case, a slice. Let me just run through slices and arrays really quickly. Anything that is square brackets with a number in it and then a type, whatever that type happens to be, is an array. Anything that has no number in the brackets and then a type is a slice. They’re pretty much the same thing; the main difference is that an array is of a fixed size. You could also have something like a 5×5: for example, [5][5]int is a 5×5 integer array, whereas []int is just an integer slice of some length. In our case we’ve got Locations, which is a slice of Location values. We don’t really know what those are yet, so we need to define them.
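
Here is a small sketch of that array versus slice distinction. The helper appendInt and the variable names are only for illustration.

```go
package main

import "fmt"

// appendInt returns the slice with one more element. A slice can grow
// like this; an array cannot, because its length is part of its type.
func appendInt(nums []int, n int) []int {
	return append(nums, n)
}

func main() {
	var grid [5][5]int // array: 5 rows of 5 ints, fixed size
	var nums []int     // slice: no number inside the brackets

	nums = appendInt(nums, 10)
	nums = appendInt(nums, 20)

	// Prints the two array dimensions and the current slice length.
	fmt.Println(len(grid), len(grid[0]), len(nums))
}
```

A slice is the natural fit for the sitemap entries, since we don’t know ahead of time how many of them the document will contain.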

While we’re talking about it, let’s go ahead and do that: type Location struct. Here the field is going to be the location, and again, don’t forget it must start with a capital letter, so Loc, and that’s a string type. Then where is it located? That’s going to be xml, under the loc tag. That one must be lowercase, because the tag itself in the document is lowercase.

Now what we can do is come down here and do var s, and s is going to be of the SitemapIndex type. Now we can unmarshal into that. We’re going to do xml.Unmarshal, with a capital U. What do we want to unmarshal? That’s going to be bytes. And where do we want to unmarshal it? Into the memory address of s, so &s. Now that we’ve done that, let’s see what we’re looking at. We should be able to fmt.Println(s.Locations), because that’s going to be our slice of data. Let’s save that, run it with go run, and see how we’ve done.
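
Putting the pieces so far together, a sketch might look like the following. The inline document with example.com URLs stands in for the bytes read from the HTTP response, and the parseIndex helper and its error check are additions for this sketch rather than something from the video.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// SitemapIndex mirrors the root element: one Location per <sitemap>.
type SitemapIndex struct {
	Locations []Location `xml:"sitemap"`
}

// Location mirrors a single <sitemap> element and keeps its <loc> text.
type Location struct {
	Loc string `xml:"loc"`
}

// parseIndex unmarshals raw XML bytes into a SitemapIndex.
func parseIndex(data []byte) (SitemapIndex, error) {
	var s SitemapIndex
	// Unmarshal needs a pointer so it can fill in s.
	err := xml.Unmarshal(data, &s)
	return s, err
}

func main() {
	// Placeholder document with made-up URLs.
	data := []byte(`<sitemapindex>
  <sitemap><loc>http://example.com/politics.xml</loc></sitemap>
  <sitemap><loc>http://example.com/opinions.xml</loc></sitemap>
</sitemapindex>`)

	s, err := parseIndex(data)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(s.Locations)
}
```

The tags are what tie each field to an element name, so the struct field names themselves are free to differ from the XML.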

What we get here is pretty much what we expected, and if you’re not new to programming, you’ve probably got some flags going off. But anyway, here are all the URLs, so we’re very close to what we wanted, but it looks odd. We can see the square brackets here, which denote a list or array or something, and that’s what we wanted, but then we have these curly braces. The reason we have these curly braces is that what we have here is not a string yet. The SitemapIndex type has a slice of Location values, and yes, the Loc inside each one is a string, but each Location is still a struct, so it prints as a struct. We actually need a String method that applies to Location.

We’ve already talked about methods, so this is relatively simple. With a String method, what are we trying to do? Are we trying to modify anything within the struct, or are we just trying to get some values out of it? We’re just trying to get some values out of it, so in this case we can use a value receiver.

Let’s go ahead and write func, and then the receiver is going to be l, of the Location type (that l looks kind of weird in Sublime, but anyway). Then it’s String, with a capital S, and string is the type that it’s going to return. It’s just going to return fmt.Sprintf(l.Loc). Save that and, if I’m not mistaken, we can just rerun it real quick. Okay, now that we’ve given it a String method, lo and behold, we actually have string URLs.
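
As a sketch of that String method with a value receiver: this version returns l.Loc directly instead of passing it through Sprintf, which is a small swap on my part, since a URL containing a percent sign would otherwise be read as a format verb. The render helper and the URLs are placeholders.

```go
package main

import "fmt"

// Location holds one sitemap URL.
type Location struct {
	Loc string
}

// String makes Location satisfy fmt.Stringer. A value receiver is
// enough because the method only reads from l.
func (l Location) String() string {
	return l.Loc
}

// render returns what fmt produces for a slice of Locations.
func render(locs []Location) string {
	return fmt.Sprint(locs)
}

func main() {
	locs := []Location{
		{Loc: "http://example.com/politics.xml"}, // placeholder URLs
		{Loc: "http://example.com/opinions.xml"},
	}
	// prints: [http://example.com/politics.xml http://example.com/opinions.xml]
	fmt.Println(render(locs))
}
```

Without the String method, the same Println would wrap each element in curly braces, because fmt falls back to printing the struct’s fields.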

Also, let me just pull up Sprintf here. There you go. It does what it says: Sprintf formats according to a format specifier and returns the resulting string. You’re going to use that pretty much every time you have a String method; if you want to convert some sort of struct to a string, this is the way you’re going to do it. To be honest, I’ve not really seen any other reason you would use Sprintf. I’m sure there are more, since I haven’t been using Golang for very long, but that appears to me to be the main use.
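
For a sense of what the format specifier part means, here is a short sketch that uses two verbs, %d for an integer and %s for a string. The label helper and the URL are hypothetical.

```go
package main

import "fmt"

// Location holds one sitemap URL.
type Location struct {
	Loc string
}

// label builds a numbered line for one location. Sprintf fills the
// verbs in order and returns the result instead of printing it.
func label(n int, l Location) string {
	return fmt.Sprintf("%d: %s", n, l.Loc)
}

func main() {
	l := Location{Loc: "http://example.com/politics.xml"} // placeholder URL
	// prints: 1: http://example.com/politics.xml
	fmt.Println(label(1, l))
}
```

Because Sprintf hands back a string rather than writing to the terminal, it fits anywhere a string is needed, not only inside String methods.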

Now that we’ve made it that far, we’ve got a slice. I’m probably going to call it a list a few times, because that’s what it looks like to me, a list of stuff, but it’s definitely a slice. There are no commas in the output, so I guess we could say it’s not a list. Anyway, what we need to do now is iterate through these values and get those URLs, then visit those URLs and, because those are sitemaps, get the URLs and maybe titles or something from those sitemaps, and so on. That’s what we’re doing in the next tutorial. Obviously we first need to learn how to actually loop over this list, so loops are what we’re going to be talking about next.

The other thing I want to show you guys real quick for the end of the tutorial: if, for whatever reason, you can’t access the Washington Post sitemap and you still want to follow along, here is what you could do. You could say var washPostXML equals a slice of bytes, []byte, and then it’s going to be a multi-line string, and paste. Boom, done. Let’s go ahead and move that underneath the import just to keep it tidy. Then you could say bytes equals washPostXML, get rid of the old lines, and still unmarshal bytes.

We’ve probably got some imports that we don’t need, but let’s just run it really quickly to find out. It complains about bytes and about ioutil. Oh, bytes needs colon-equals, and then it was ioutil that we didn’t need, so I can just remove that real quick, bring this back up, and run again. Oh come on, just please work this time; I don’t have time for this, I just wanted to show you guys real quickly. Okay, and then it’s probably going to get angry at us about the http import too, since we’re no longer using it. There we go.
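
Here is roughly what that offline version could look like. The two entries and their example.com URLs are made up to keep it short, the locations helper is only there to keep the sketch tidy, and the String method returns l.Loc directly rather than going through Sprintf.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// washPostXML stands in for the downloaded sitemap index. The URLs are
// placeholders, not the real Washington Post entries.
var washPostXML = []byte(`<sitemapindex>
   <sitemap>
      <loc>http://example.com/politics.xml</loc>
   </sitemap>
   <sitemap>
      <loc>http://example.com/opinions.xml</loc>
   </sitemap>
</sitemapindex>`)

// SitemapIndex mirrors the root element: one Location per <sitemap>.
type SitemapIndex struct {
	Locations []Location `xml:"sitemap"`
}

// Location mirrors a single <sitemap> element and keeps its <loc> text.
type Location struct {
	Loc string `xml:"loc"`
}

// String lets fmt print a Location as its bare URL.
func (l Location) String() string {
	return l.Loc
}

// locations unmarshals the XML and returns the parsed entries.
func locations(data []byte) ([]Location, error) {
	var s SitemapIndex
	if err := xml.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return s.Locations, nil
}

func main() {
	bytes := washPostXML // replaces the HTTP request and the body read
	locs, err := locations(bytes)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(locs)
}
```

With the data embedded like this, the only imports left are encoding/xml and fmt, which matches the unused-import errors that come up when the HTTP lines are removed.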

Obviously it’s short, since I used the shorter XML, but that’s how you can still follow along if and when the sitemap goes away. It’s also kind of nice because you can come in here, maybe add your own new tags, and play around with it. I encourage you to try your own sitemap index and try to figure out how to build the structs and all that, because that’s not the most intuitive thing ever, in my opinion. Anyway, that’s all for now. In the next tutorial we’ll talk about looping, because we want to be able to loop over that list. If you have questions, comments, concerns, whatever, feel free to leave them below. Otherwise, I will see you in the next Go tutorial.
