00:00:03part two of the Python reddit API
00:00:05wrapper or PRAW tutorial mini-series in
00:00:08this tutorial what we’re talking about
00:00:10is at least beginning to parse comments
00:00:13so like I said at the end of the
00:00:15last video comments represent a
00:00:16different kind of challenge for a
00:00:18variety of reasons mainly it’s just the
00:00:20fact that comments aren’t you know
00:00:22perfectly in order it’s a tree of
00:00:25data it’s not a linear form of data so
00:00:29anyways I’m going to go ahead and remove
00:00:32that subreddit dot subscribe but the rest
00:00:34of this stuff can remain so just
00:00:37underneath this let’s go ahead and
00:00:38continue so the first thing we could do
00:00:41is first of all I want to limit this to
00:00:43there are two stickies so
00:00:47I’m just going to limit this to three
00:00:48just so we don’t go you know so we just
00:00:51do one submission for now and now I’m
00:00:54going to come down here and we can
00:00:58reference the comments by just saying
00:01:00comments equals submission dot
00:01:03comments so this gives us the comments
00:01:06so now what we can do is say for comment
00:01:09in comments we can go ahead let’s go
00:01:13ahead let’s like print 20 times this just
00:01:18to give us some separation and then
00:01:20what we’re going to do is we’re going to
00:01:21print comment but just like a submission
00:01:24the comments are like these objects like
00:01:27the PRAW object and the object is just
00:01:31going to have the ID so then you
00:01:33reference an attribute and one of the
00:01:34attributes is body for the body of that
00:01:37comment and then what we’re going to say
00:01:40is so that’s our that’s our comment so
00:01:42we can at least iterate through comments
00:01:44that way so for example let’s just run
00:01:45that real quick and here you are
00:01:51so these are like all our you know
00:01:55comments now let me pull up that page
00:02:01did I just close out of it I guess I
00:02:04closed out of it
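The loop described so far can be sketched as follows. This is a minimal illustration, not the exact code from the video: the stand-in objects and the helper name `top_level_bodies` are assumptions so the example runs without Reddit credentials. With PRAW, `comments` would come from `submission.comments` instead.

```python
from types import SimpleNamespace


def top_level_bodies(comments):
    """Collect the body text of each top-level comment, in order."""
    return [comment.body for comment in comments]


# Stand-in data (assumption) so this runs offline. With PRAW the
# equivalent would be: comments = submission.comments
comments = [
    SimpleNamespace(id="c1", body="Nice tutorial."),
    SimpleNamespace(id="c2", body="Look into pandas."),
]

for body in top_level_bodies(comments):
    print(20 * "-")  # separator line between comments
    print(body)
```

Printing the comment object itself would only show its ID, which is why the `body` attribute is referenced.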
00:02:06[Music]
00:02:07pull over mine so that was why so
00:02:12there’s six comments total here but some
00:02:15of these are like replies like for
00:02:17example if you’re unfamiliar do yourself
00:02:19a favor and look into pandas so for
00:02:22example let me look for this
00:02:26one okay okay anyway it’s not here okay
00:02:31so what we have to do is iterate through
00:02:33it at least I’m pretty sure it’s not
00:02:34there so these would be just like top
00:02:37levels I’m pretty sure I just want to be
00:02:45a hundred percent sorry for wasting your
00:02:47time anyway so I think I closed it again I
00:02:51guess I’m bad at closing things anyway I’m
00:02:54pretty sure it’s not there so what we
00:02:55need to do is get the replies so now we
00:02:58could say you know for reply so for or
00:03:06rather what we should do is
00:03:08there might not be any replies so then
00:03:11we could say if len of comment dot replies
00:03:16is greater than zero and again if you
00:03:18didn’t know replies existed you could
00:03:20have done a dir on comment or
00:03:22you can read the docs anyway
00:03:25if len of comment dot replies is greater
00:03:27than zero so we have some replies then
00:03:28what we do is a for reply in comment dot
00:03:31replies hmm we get loops that’s not a
00:03:35that’s a thank you anyway
00:03:38we can print then let’s just say like
00:03:41reply and also we want body
00:03:48on that
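A sketch of the one-level reply check just described, again with stand-in objects rather than live PRAW comments. The helper names `make_comment` and `one_level_lines`, and the `REPLY:` label, are assumptions for illustration. Note that a reply to a reply is not reached by this loop.

```python
from types import SimpleNamespace


def make_comment(body, replies=()):
    """Hypothetical helper: a stand-in for a PRAW comment object."""
    return SimpleNamespace(body=body, replies=list(replies))


def one_level_lines(comments):
    """Return each top-level body followed by its direct replies only."""
    lines = []
    for comment in comments:
        lines.append(comment.body)
        if len(comment.replies) > 0:
            for reply in comment.replies:
                lines.append("REPLY: " + reply.body)
    return lines


# A second-level reply that the loop above never visits.
deep = make_comment("Agreed, pandas is great.")
thread = [
    make_comment("What should I use for tables?",
                 [make_comment("Look into pandas.", [deep])]),
    make_comment("Thanks for the video."),
]

for line in one_level_lines(thread):
    print(line)
```

Each extra level of nesting would need another hand-written loop, and there is no way to know in advance how many levels a thread has.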
00:03:55okay so here you get a it’s just me
00:03:58reply really great high-quality reply
00:04:02yeah okay so oh and here’s another reply
00:04:08I was like this really isn’t another one
00:04:09yet so this is the this is that comment I
00:04:11just searched for a second ago so there
00:04:13we caught that reply about pandas but
00:04:16then I think I close this let me open it
00:04:20again
00:04:20someone complained on one of my videos
00:04:22that I just murder my Enter key it’s
00:04:25true uh okay there you go so
00:04:31so look into pandas but then
00:04:32there’s another comment underneath that
00:04:34right so then we would have to be like
00:04:36um you know we’d have to just
00:04:40basically okay and then at this print
00:04:41reply we could say okay if len of reply
00:04:44dot replies is greater than zero but
00:04:46you have no idea how deep down the
00:04:48rabbit hole the comment tree things go
00:04:50right so that’s that’s slightly
00:04:52problematic then so the solution is we
00:04:57can actually say submission dot comments we
00:05:03can add dot list to this and this will
00:05:06list out all of the comments so dot
00:05:10list I believe is purely a Python reddit
00:05:14API wrapper thing so purely PRAW
00:05:17functionality that’s not something that’s
00:05:18actually available to you in the Python
00:05:20all right it’s not something that’s
00:05:22actually available to you in the
00:05:23reddit API but anyways that doesn’t
00:05:25matter
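To make the flattening concrete, here is a rough sketch of a level-by-level (breadth-first) walk over a comment tree. This is not PRAW's own implementation. It is a stand-in written against hypothetical objects that only have `body` and `replies`, to show the order in which a flattened list comes out.

```python
from collections import deque
from types import SimpleNamespace


def make_comment(body, replies=()):
    """Hypothetical helper: a stand-in for a PRAW comment object."""
    return SimpleNamespace(body=body, replies=list(replies))


def flatten_level_by_level(top_level):
    """Return every comment in the tree: top level first, then deeper levels."""
    flat = []
    queue = deque(top_level)
    while queue:
        comment = queue.popleft()
        flat.append(comment)
        # Replies go to the back of the queue, so a whole level is
        # emitted before any comment from the next level down.
        queue.extend(comment.replies)
    return flat


thread = [
    make_comment("A", [make_comment("A1", [make_comment("A1x")])]),
    make_comment("B", [make_comment("B1")]),
]

order = [c.body for c in flatten_level_by_level(thread)]
print(order)  # ['A', 'B', 'A1', 'B1', 'A1x']
```

With PRAW the flattened list comes from `submission.comments.list()`, so none of this has to be written by hand.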
00:05:26let me go ahead and close this so we’ve
00:05:27got a nice clean thing and then also we
00:05:31uh we kind of want to do like print
00:05:34comment body we don’t really want to do
00:05:36the replies so let’s just do that to
00:05:41cancel this real quick
00:05:46so in this case we’ve run through all of
00:05:50them so here you go here’s a the
00:05:53second-level reply now unfortunately we
00:05:56have absolutely no idea of the
00:05:57contextual data for this like we don’t
00:05:59really know where this this was in the
00:06:02whole thing so for example you know you
00:06:05wouldn’t really know that this was in
00:06:07reply to you know which reply it was to
00:06:10now
00:06:11what list does is basically it takes all
00:06:13the top-level comments list those out
00:06:16then it goes down to the second level
00:06:18comments lists all those out then third
00:06:20level and so on so one option you have
00:06:22is rather than comment body what you
00:06:25could say is you can also grab like you
00:06:28could you can grab and print the parent ID
00:06:34and that would be comment dot parent now
00:06:39do note that’s not an attribute
00:06:40that’s an actual new API call which in
00:06:44my opinion is super unfortunate I wish
00:06:46that was supplied and I don’t think
00:06:48that’s a mistake I believe that’s that’s
00:06:52just in reddit and I realize not every
00:06:54comment is going to necessarily have a
00:06:55parent but pretty much every comment
00:06:57would right like you know the parent is
00:06:59the actual submission or the parent is
00:07:01another comment so and these are like
00:07:04little tiny ID strings like I really
00:07:07think that should be included but it’s
00:07:08not it’s a new API call
00:07:10so anyway comment ID so comment dot
00:07:14parent and rather than that this one is
00:07:15just comment dot ID which is actually
00:07:18an attribute so huh crazy I can’t
00:07:23remember if a submission I’m pretty sure
00:07:25like the submission contains the
00:07:27subreddit ID so I could be wrong
00:07:31though anyway that’s okay so now what we
00:07:34could do is get the parent ID in the
00:07:35comment ID of every comment and then
00:07:39what we could do is print the comment
00:07:40body
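A small sketch of printing the parent ID, comment ID, and body for each entry of a flattened list. The video uses `comment.parent()`. As far as I know, PRAW comments also carry a `parent_id` string attribute whose prefix marks the parent type (`t1_` for a comment, `t3_` for the submission), and that attribute is what the stand-in objects below imitate. The helper name `parent_kind_and_id` and the sample IDs are assumptions.

```python
from types import SimpleNamespace


def parent_kind_and_id(comment):
    """Split a fullname such as 't1_abc' into ('comment', 'abc')."""
    prefix, _, bare_id = comment.parent_id.partition("_")
    kind = {"t1": "comment", "t3": "submission"}.get(prefix, "unknown")
    return kind, bare_id


# Stand-in objects (assumption) shaped like entries of a flattened list.
flat = [
    SimpleNamespace(id="c1", parent_id="t3_s1", body="Top-level comment"),
    SimpleNamespace(id="c2", parent_id="t1_c1", body="Reply to c1"),
]

for comment in flat:
    kind, pid = parent_kind_and_id(comment)
    print(f"parent {kind} {pid} -> comment {comment.id}: {comment.body}")
```

Reading an attribute that is already on the object avoids an extra lookup per comment, which matters on threads with thousands of comments.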
00:07:45and then you’ve got the parent ID in the
00:07:46comment ID of everything now from
00:07:50that point you could begin to do some
00:07:53pretty cool stuff but the first thing I
00:07:55want to show you is right let’s say
00:07:57let’s say we don’t do Python and instead
00:07:59we do news so very very popular
00:08:01subreddit and if this doesn’t work I’ll
00:08:03do like politics or something but we
00:08:06should hit an error here let’s go there
00:08:13we go here we go there’s the error so if you
00:08:16use the dot list and you actually do
00:08:18iterate through all comments chances are
00:08:20eventually you’re going to wind up with
00:08:22this stupid error so MoreComments
00:08:25object has no attribute parent ok so
00:08:28what’s happening there is like on really
00:08:31long comment chains so like for example
00:08:34let me go to the news subreddit that
00:08:40would be this one marijuana company buys
00:08:42entire US town to create cannabis
00:08:44friendly municipality that’s going to have
00:08:47lots of comments so for example right
00:08:49away you can see here this like load
00:08:53more comments that’s a MoreComments
00:08:56object and actually even though reddit
00:08:59looks super simple the moment
00:09:00you click this I’m pretty sure
00:09:01you’re making a new call like it’s an
00:09:04actual call to their database same thing
00:09:07with like continue this thread that’s a
00:09:08new call it’s going to reload that data
00:09:10like all this data is not loaded on your
00:09:12page load that would be nuts you’d never
00:09:14load the page so anyways if you wanted
00:09:17to continue iterating through those
00:09:19comments you would need to also either
00:09:21handle with a you know an exception or
00:09:23something like that or one option you
00:09:25have is to replace the mores so for
00:09:29example coming down here comments dot
00:09:31list one option you have is so you could
00:09:41you can just use dot replace more kind
00:09:43of starting to add a little too many um
00:09:45a little too many things here but let’s
00:09:50just do
00:09:52I’ll do I’ll add the dot list down here
00:09:54and then what we’ll say is dot replace
00:09:59underscore more and then for now we’ll
00:10:02say limit equals zero but at some point
00:10:04you will run into limits with the
00:10:06replace more like there’s only so many
00:10:07more it will add I think it’s 30 or
00:10:09something like that which is still a ton of
00:10:12comments because like each replace more
00:10:14will load in a bunch of comments but
00:10:16just keep that in mind like you’re you
00:10:17you’re going to run out eventually
00:10:20but it won’t error if you do run out of
00:10:21the option to continue replacing instead
00:10:24it’s just going to toss them so you
00:10:25won’t hit an actual error anymore so
00:10:27anyways let’s let’s go ahead and run
00:10:28this real quick and probably I should
00:10:30remove the parent call that’s going to
00:10:32slow me down
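The two options mentioned here, handling the placeholder objects yourself or replacing them, can be sketched like this. The runnable part uses a stand-in class (`FakeMoreComments`, an assumption) so it works offline. The PRAW calls appear only in comments.

```python
from types import SimpleNamespace


class FakeMoreComments:
    """Stand-in for PRAW's MoreComments placeholder: it has no body."""


def bodies_skipping_placeholders(flat):
    """Option 1: skip placeholder objects instead of crashing on them."""
    return [item.body for item in flat
            if not isinstance(item, FakeMoreComments)]


flat = [
    SimpleNamespace(body="First"),
    FakeMoreComments(),
    SimpleNamespace(body="Second"),
]
print(bodies_skipping_placeholders(flat))

# Option 2, with PRAW itself (not executed here):
#
#   submission.comments.replace_more(limit=0)
#   for comment in submission.comments.list():
#       print(comment.body)
#
# replace_more goes on its own line because it updates the comment
# forest in place and does not return the forest, so .list() cannot be
# chained onto it. limit=0 drops the placeholders without fetching
# anything; a larger limit fetches more comments at the cost of more
# API requests.
```

With a real submission the first option would check against `praw.models.MoreComments` rather than the stand-in class.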
00:10:35what hmm let’s see submission dot
00:10:42comments okay replace more hmm
00:10:46okay fine fine fine one one okay dot
00:10:48list and then we’ll come over here
00:10:50comments dot replace more okay so first
00:10:53we’ve converted it to list form which
00:10:55then creates this MoreComments object
00:10:58and now we can replace them I just did
00:11:00it backwards this should work that’s
00:11:03still going to be a lot of queries to
00:11:04the API but hopefully we’ll get through
00:11:06it are you kidding me please what have I
00:11:14done what have I done
00:11:17comments dot replace more so comments
00:11:19equals submission dot comments
00:11:25I think I had it right the first time so
00:11:32comments equals submission comments
00:11:39please
00:11:41so we say submission dot comments dot
00:11:47replace more limit equals 0 now for
00:11:53comment in comments let’s see no for
00:11:58comment in submission dot comments I really
00:12:04feel like I should have been able to
00:12:05string that someone can comment below
00:12:07what the fix should have been because I
00:12:09don’t see why I wasn’t able to string
00:12:11those together but obviously I’m messing up
00:12:12something so for comment in submission
00:12:15dot comments dot list let me try that drink
00:12:21some more coffee there we go
00:12:27not a problem that’s going forever
00:12:30though I’m going to go I’m just going to
00:12:31break that it’s lots of
00:12:34API calls eventually it would probably
00:12:36throttle me anyway as you can see now
00:12:39we’ve got all the parent IDs the comment
00:12:40IDs everything’s hunky-dory we’re doing
00:12:42great so go ahead and close this out so
00:12:47so that’s how you can iterate through
00:12:50all the comments and all that now now
00:12:55the question is you know how might you
00:12:57rebuild that comment tree right because
00:13:00at some point right like you’ve got to
00:13:02rebuild that tree so for example one
00:13:05option you could have is like build a
00:13:08dictionary or something like that and
00:13:10then each of like the you know like the
00:13:12parent you’ve got a parent ID and then
00:13:16the parent content and then all the
00:13:19replies so a parent ID content all
00:13:21replies parent ID content all replies
00:13:23and if you did that you could rebuild
00:13:25the tree yourself now I’m not going to
00:13:27go ahead and go through all that I don’t
00:13:29really see too much point covering that
00:13:31in video but if you are interested in
00:13:33that you can go to part 2 of this
00:13:35tutorial series on pythonprogramming.net
00:13:38and there’ll be an example there if
00:13:39you’re interested in truly rebuilding
00:13:41those comment trees that’s one way you
00:13:43could do it that’s how I would do it
00:13:45anyway if you have a better way I’m sure
00:13:47somebody could come up with a better way
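One way the dictionary idea could look, as a sketch rather than the example from the written tutorial. It assumes each flattened comment exposes `id`, `body`, and a type-prefixed `parent_id` (`t3_` for the submission, `t1_` for a comment). The function names and sample IDs are hypothetical.

```python
from types import SimpleNamespace


def build_tree(flat, submission_fullname):
    """Group comments by parent, then nest replies under each comment."""
    children = {}
    for comment in flat:
        children.setdefault(comment.parent_id, []).append(comment)

    def node(comment):
        replies = children.get("t1_" + comment.id, [])
        return {
            "id": comment.id,
            "body": comment.body,
            "replies": [node(reply) for reply in replies],
        }

    return [node(c) for c in children.get(submission_fullname, [])]


def render(nodes, depth=0):
    """Return indented lines, two spaces per nesting level."""
    lines = []
    for n in nodes:
        lines.append("  " * depth + n["body"])
        lines.extend(render(n["replies"], depth + 1))
    return lines


# Stand-in data (assumption), in the level-by-level order a flat list has.
flat = [
    SimpleNamespace(id="c1", parent_id="t3_s1", body="Use pandas"),
    SimpleNamespace(id="c2", parent_id="t3_s1", body="Thanks"),
    SimpleNamespace(id="c3", parent_id="t1_c1", body="Agreed"),
    SimpleNamespace(id="c4", parent_id="t1_c3", body="Same here"),
]

tree = build_tree(flat, "t3_s1")
print("\n".join(render(tree)))
```

Grouping by parent first means the flat list is scanned once, and the order of the input does not matter.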
00:13:49anyways so now in the next tutorial
00:13:53what we’re going to talk about is basically
00:13:56just streaming from reddit so this has
00:13:59all been like historical grabbing from
00:14:01reddit but there’s also a way you can
00:14:03actually just stream data from reddit so
00:14:05anyways that’s what we’re going to be doing in
00:14:06the next tutorial if you’ve got
00:14:07questions comments concerns whatever
00:14:09feel free to leave them below otherwise I will
00:14:10see you in the next tutorial