    #!/usr/bin/env perl
    use 5.14.1;
    use warnings;
    use utf8;
    use open qw/:std :utf8/;
    (. ./env.sh; perl ./unroll.pl 1524045502714728449)
Boston Perl Mongers TONIGHT Tuesday, May 10, 2022 7:30 PM to 9:30 PM EDT (Every 2nd Tuesday of the month) (on JITSI)
Using Perl's Twitter::API
Twitter::API
    The useful wrapper of Twitter's API.
Data::Dump
    Using only pp, for pretty-printing debug dumps of nested HoAoH structures.
Try::Tiny
    Everyone's favorite error handler, needed as remote APIs can choke up.
constant
    To allow an if DEBUG symbol.
    use Twitter::API;
    use Data::Dump qw/pp/;
    use Twitter::API::Util 'is_twitter_api_error';
    use Try::Tiny;
    use constant { DEBUG=>0, } ;
Before using any Twitter API application, you must use OAuth to authorize API access with your account.
See oauth_desktop.pl in the Twitter::API distribution for details.
A permanent application should create its own application key, but test apps may use the Twitter::API module's own.
The application can read the four security parameters from a protected config file or process environment. This demo reads from ENV.
    my $client = Twitter::API->new_with_traits(
        traits              => [ qw/ApiMethods RateLimiting DecodeHtmlEntities NormalizeBooleans/ ],
        consumer_key        => $ENV{CONSUMER_KEY},
        consumer_secret     => $ENV{CONSUMER_SECRET},
        access_token        => $ENV{ACCESS_TOKEN},
        access_token_secret => $ENV{ACCESS_TOKEN_SECRET},
    );

    my $r = $client->verify_credentials;
    # say "$$r{screen_name} is authorized";

    my $mentions = $client->mentions;
    # for my $status ( @$mentions ) {
    my $status;
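A missing variable otherwise only shows up later as an authorization failure, so it can help to check the environment before building the client. The following is a minimal sketch, assuming the same four variable names as above; `missing_credentials` is a hypothetical helper and not part of Twitter::API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Twitter::API): return the names of any
# credential variables that are unset or empty in the given hash.
sub missing_credentials {
    my ($env) = @_;
    my @required = qw/CONSUMER_KEY CONSUMER_SECRET ACCESS_TOKEN ACCESS_TOKEN_SECRET/;
    return grep { !defined $env->{$_} || $env->{$_} eq q() } @required;
}

# Check the real process environment and complain early.
my @missing = missing_credentials(\%ENV);
warn "Missing credentials: @missing\n" if @missing;
```

Passing the hash in as an argument, rather than reading %ENV inside the helper, keeps the check easy to try against a config file's contents as well.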
The sole command-line argument is a Twitter status number, which is the last tweet in a chain, from which to unroll backwards.
The immediate purpose for this unroller was to have a non-caching, non-tracking unroll of a historical project's thread on #ordainedslavery, Mass Bay Puritan preachers who owned human beings.
The 101st entry in the thread is the default start in this script: https://twitter.com/elevennames/status/1509876985744355329
As a bonus on top of the unroll, this script also makes a heuristic attempt at a Town index, so it collects tweets in a reversing list @Keepers and an HoA %Towns.
$id is the next tweet to process, starting at the tail of the thread (from the argument or the default).
    my $id= shift @ARGV // 1509876985744355329; # Latest end of thread, should be a parameter

    my @Keepers;
    my %Towns;
Loop logic is simple: keep looking up $id and chaining until we reach the beginning.
    while ($id) {

        undef $status;    # forget the previous tweet so a failed fetch is detectable
        try {
            $status = $client->show_status($id, { cache=>'none', tweet_mode=>'extended' } );
            say ref $status if DEBUG;
        }
        catch {
            die $_ unless is_twitter_api_error($_);

            # The error object includes plenty of information
            say $_->http_request->as_string;
            say $_->http_response->as_string;
            say 'No use retrying right away' if $_->is_permanent_error;
            if ( $_->is_token_error ) {
                say "There's something wrong with this token."
            }
            if ( $_->twitter_error_code == 326 ) {
                say "Oops! Twitter thinks you're spam bot!";
            }

        };

        # Stop here if the lookup failed, rather than processing an empty status.
        last unless $status;

        # say $status->{user}->{screen_name}, q(: ), $status->{full_text};
$status->{full_text} is the message body needed.
Heuristically grab the serial number, the prelate's name, and the town from the tweet.
Hashtags that apply to the whole series are skipped, but otherwise likely indicate the town.
User mentions are likely a Historical Society account, and indicate a town.
This heuristic section is tuned to the specific use and would be greatly simplified for generic use! For a conversational thread, one would want to capture user names (handle and/or display name), but since the purpose was unrolling a soliloquy thread, that isn't done here.
        # What to save
        my @Temp = ($status->{full_text});

        # grab post number, and lead name if possible.
        # NOT case-insensitive to avoid needing stop-words
        my ($num, $reverend) = ($status->{full_text}) =~ /
            (?: ^ | \s)
            (\d+(?: [.][0-9]+)? ) [.]? \s+
            ((?: Rev[erend]*[.]? \s* )?
             (?: Mr[.]? \s*)?
             (?: [[:upper:]][[:word:]]+ \s* )+
            )?
            /xsm;
        $num //= q(??);
        $reverend //= q();
        # NB: $reverend is always defined by now, so only the ?? test can fire.
        say pp($status) if $num eq q(??) or (! defined $reverend and $num !~ /\d+[.]\d+/);

        if ($status->{entities}->{hashtags}) {
            my @tags = grep {not $_ =~ m/slavery/ }
                       map {$_->{text}}
                       $status->{entities}->{hashtags}->@* ;
            unshift $Towns{$_}->@*, "$num. $reverend" for @tags;

        }
        if ( scalar $status->{entities}->{user_mentions}->@* ){
            push $Towns{$_}->@*, "$num. $reverend" for map {$_->{screen_name}} $status->{entities}->{user_mentions}->@*

        }
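To see what the heuristic captures without calling the API, the same pattern can be tried on plain strings. This is a sketch only: `parse_lead` is a hypothetical wrapper, and the sample texts are invented rather than taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern above, for offline experiments.
# Returns (serial number, lead name), with the same fallbacks as the script.
sub parse_lead {
    my ($text) = @_;
    my ($num, $reverend) = $text =~ /
        (?: ^ | \s )
        ( \d+ (?: [.][0-9]+ )? ) [.]? \s+
        ( (?: Rev[erend]*[.]? \s* )?
          (?: Mr[.]? \s* )?
          (?: [[:upper:]][[:word:]]+ \s* )+
        )?
    /xsm;
    return ($num // q(??), $reverend // q());
}

# Invented sample texts, not real tweets.
for my $text ('101. Rev. Sample Name of Anytown',
              '12.1 more on the same household',
              'no serial number here') {
    my ($num, $reverend) = parse_lead($text);
    print "[$num] [$reverend]\n";
}
```

Note that the captured name keeps its trailing whitespace, because each capitalized word is matched together with the `\s*` that follows it; trimming it before building the index is an option.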
Annoyingly, pictures ("media") and links ("urls") are in two different forks of the nested HoAo? structure.
        # Need to use extended_entities to see > 1 photo.
        # Expanded URLs all have /1, not /1 .. /4, so need media_url instead.

        if ($status->{extended_entities}->{media}){
            # say "Media ", $_->{media_url} for $status->{extended_entities}->{media}->@* ;
            push @Temp, $_->{media_url} for $status->{extended_entities}->{media}->@* ;
        }

        # But links are ok in plain entities.
        if ($status->{entities}->{urls}){
            # say "Link ", $_->{expanded_url} for $status->{entities}->{urls}->@* ;
            push @Temp, $_->{expanded_url} for $status->{entities}->{urls}->@* ;
        }
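The two forks can be illustrated with a mock status in the shape described above. This is a sketch under assumptions: `collect_links` is a hypothetical helper, and the example.com URLs are invented placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: gather photo and link URLs from one decoded status.
sub collect_links {
    my ($status) = @_;
    my @links;
    # The "// []" fallback lets a missing branch behave like an empty list.
    push @links, map { $_->{media_url} }    @{ $status->{extended_entities}{media} // [] };
    push @links, map { $_->{expanded_url} } @{ $status->{entities}{urls} // [] };
    return @links;
}

# Invented data: two photos under extended_entities, one link under entities.
my $mock = {
    extended_entities => { media => [ { media_url => 'http://example.com/a.jpg' },
                                      { media_url => 'http://example.com/b.jpg' } ] },
    entities          => { urls  => [ { expanded_url => 'http://example.com/page' } ] },
};
print "$_\n" for collect_links($mock);
```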
Re-establish our loop invariant.
(Debug code here will dump the tail tweet and bail on the loop; good for debugging heuristic collection.)
        # loop chaining
        $id = $status->{in_reply_to_status_id} // undef;
        say "PREVIOUS $id" if DEBUG;

        say pp($status) if DEBUG;

        last if DEBUG;
By unshifting onto @Keepers, the list of tweets is reversed as it's collected. (If we pushed, we'd have to do a pop loop or an explicit reverse.)
The effect is as if we'd done unshift @Keepers, [ "full text", "url", ...];
        unshift @Keepers, \@Temp;
    } # While
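The chaining and the reversal can be tried without network access. The sketch below is an illustration only: `unroll` is a hypothetical stand-in for the loop above, and a small hash with invented ids replaces the API lookup.

```perl
use strict;
use warnings;

# Invented three-tweet thread: each entry names the tweet it replies to.
my %mock = (
    30 => { full_text => 'third',  in_reply_to_status_id => 20 },
    20 => { full_text => 'second', in_reply_to_status_id => 10 },
    10 => { full_text => 'first',  in_reply_to_status_id => undef },
);

# Hypothetical stand-in for the API loop: walk from the tail to the head.
sub unroll {
    my ($id, $lookup) = @_;
    my @keepers;
    while ($id) {
        my $status = $lookup->{$id} or last;      # stop on an unknown id
        unshift @keepers, $status->{full_text};   # newest seen goes to the front
        $id = $status->{in_reply_to_status_id};   # undef at the head ends the loop
    }
    return @keepers;
}

print join(', ', unroll(30, \%mock)), "\n";   # prints "first, second, third"
```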
First output is the %Towns index, which is produced in sorted order.
This uses the modern postfix-deref notation.
    # Give town index
    for my $town (sort keys %Towns ) {
        my $aref = $Towns{$town};
        say "$town: ",join(q(, ), $aref->@* );

    }
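For readers new to the notation, `$aref->@*` yields the same list as the older circumfix forms `@{ $aref }` and `@$aref`; postfix dereference has been enabled by default since Perl 5.24. A small sketch with invented data follows; `index_lines` is a hypothetical helper, not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: one "town: entries" line per key, in sorted order.
sub index_lines {
    my ($towns) = @_;
    return map {
        my $aref = $towns->{$_};
        "$_: " . join(q(, ), $aref->@*)    # same list as @{ $aref } or @$aref
    } sort keys %{ $towns };
}

# Invented data, not taken from the thread.
my %demo = ( Othertown => ['2. Mr. B'], Anytown => ['1. Rev. A', '3. Rev. C'] );
print "$_\n" for index_lines(\%demo);
```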
And finally, print the message thread, in original sequence, full text with saved media links.
    # Give full list in order
    # say pp \@Keepers;
    say "\n\n";

    for my $kept (@Keepers){
        say $_ for $kept->@*;
        say "";
    }