A proto_net is a list containing two data frames named edges and nodes.
as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)

# S3 method for data.frame
as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)

# S3 method for data.table
as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)
| Argument | Description |
|---|---|
| tweet_df | A data frame of tweets, as obtained by |
| target_class | |
| all_status_data | |
| all_user_data | |
| as_tibble | |
| ... | Arguments passed to or from other methods. |
In a proto_net, users are always on the source/from/tail side of dyads. target_class
defaults to "user", which creates edges with users on both sides of dyads.
However, users can also share edges with "hashtag"s, "url"s, or "media", so
those values are also valid for target_class, where they create two-mode/bipartite
proto_nets.
The edges of a proto_net represent the statuses that form each tie, and status-specific
columns are attached to the edges.
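To make this layout concrete, here is a minimal base-R sketch that builds a proto_net-like list by hand for a user-to-hashtag network. The toy data and the helper `toy_proto_net()` are hypothetical and are not tweetio's implementation. Only the `edges`/`nodes` layout, the `from`/`to`/`status_id`/`relation` columns, and the class and `target_class` attributes follow what the function is shown to return.

```r
# Toy statuses: one row per (status, hashtag) pair. Hypothetical data.
toy_statuses <- data.frame(
  user_id   = c("u1", "u1", "u2"),
  status_id = c("s1", "s1", "s2"),
  hashtag   = c("rstats", "dataviz", "rstats"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not part of tweetio):
# users go on the `from` side, hashtags on the `to` side.
toy_proto_net <- function(statuses) {
  # Each edge keeps the status_id of the status that formed the tie.
  edges <- data.frame(
    from      = statuses$user_id,
    to        = statuses$hashtag,
    status_id = statuses$status_id,
    relation  = "uses",
    stringsAsFactors = FALSE
  )
  # Nodes are the unique names appearing on either side of an edge.
  nodes <- data.frame(
    name = unique(c(edges$from, edges$to)),
    stringsAsFactors = FALSE
  )
  structure(
    list(edges = edges, nodes = nodes),
    class = "proto_net",
    target_class = "hashtag"
  )
}

toy_net <- toy_proto_net(toy_statuses)
```

Because every edge carries its own `status_id`, a single status that uses two hashtags (here `s1`) produces two edges from the same user.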
Casing
Twitter hashtags are not case-sensitive, so if target_class is "hashtag" they will
be cast to lower-case so they can represent the same node in downstream tasks.
URLs (after the domain) can be case-sensitive, so they are left as-is.
If you decide to cast URLs to lower-case yourself, exercise caution with Twitter's media URLs, as they contain a case-sensitive hash.
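The effect of the casing rules can be illustrated with base R's `tolower()`. This is a sketch on made-up values (the `example.com` URLs are placeholders), not the package's internal code.

```r
# Hypothetical inputs for illustration only.
hashtags <- c("Sooners", "sooners", "SOONERS")
urls <- c("https://example.com/Page", "https://example.com/page")

# Hashtags: lower-casing collapses the variants into a single node name.
hashtag_nodes <- unique(tolower(hashtags))

# URLs: left untouched, so paths that differ only by case stay distinct.
url_nodes <- unique(urls)

# Lower-casing URLs would merge two potentially different resources.
url_nodes_lowered <- unique(tolower(urls))
```

Here `hashtag_nodes` holds one name, `url_nodes` keeps both URLs, and `url_nodes_lowered` shows how a blanket lower-case would conflate them.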
all_status_data and all_user_data default to FALSE as they can be performance
bottlenecks for large data sets, but they provide a way of building richly decorated
networks with the maximum amount of attribute data embedded in the graph structure.
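Conceptually, decorating a network means joining extra columns onto the edges (by status) and onto the nodes (by user). The base-R sketch below shows that idea with `merge()` on toy data. The tables `status_attrs` and `user_attrs` are hypothetical, and this is not a description of how tweetio performs the join internally.

```r
# Minimal edges and nodes, shaped like a proto_net. Hypothetical data.
edges <- data.frame(
  from      = c("u1", "u2"),
  to        = c("u2", "u3"),
  status_id = c("s1", "s2"),
  relation  = c("retweet", "mentions"),
  stringsAsFactors = FALSE
)
nodes <- data.frame(name = c("u1", "u2", "u3"), stringsAsFactors = FALSE)

# Hypothetical attribute tables keyed by status and by user.
status_attrs <- data.frame(
  status_id = c("s1", "s2"),
  text      = c("RT @u2 hello", "@u3 hi"),
  stringsAsFactors = FALSE
)
user_attrs <- data.frame(
  name        = c("u1", "u2"),
  screen_name = c("alice", "bob"),
  stringsAsFactors = FALSE
)

# Left joins: every edge and node is kept, attributes are attached where known.
rich_edges <- merge(edges, status_attrs, by = "status_id", all.x = TRUE)
rich_nodes <- merge(nodes, user_attrs, by = "name", all.x = TRUE)
```

Each extra column widens both tables, which is one reason richer networks cost more time and memory on large data sets. Nodes with no known attributes (here `u3`) end up with `NA` values.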
path_to_tweet_file <- example_tweet_file()
tweet_df <- read_tweets(path_to_tweet_file)

tweet_df %>% as_proto_net(as_tibble = TRUE)
#> $edges
#> # A tibble: 1,234 x 4
#>    from                to                  status_id           relation
#>    <chr>               <chr>               <chr>               <chr>
#>  1 194250838           340309688           1178007813257388032 retweet
#>  2 825459487821619201  966825602           1178007817426370560 retweet
#>  3 1116228559616397312 3153145782          1178007830034448384 retweet
#>  4 4374655520          4167284315          1178007830005080064 retweet
#>  5 1172885625068036102 1112877891841343488 1178007838418984960 retweet
#>  6 1132474594583928832 1217220278          1178007842592149505 retweet
#>  7 20737729            39334221            1178007846790692864 retweet
#>  8 2198859787          3096758526          1178007846782287878 retweet
#>  9 3237877098          847885187325276164  1178007851001737217 retweet
#> 10 868219009350676481  1962155772          1178007859403132928 retweet
#> # … with 1,224 more rows
#>
#> $nodes
#> # A tibble: 1,228 x 1
#>    name
#>    <chr>
#>  1 194250838
#>  2 825459487821619201
#>  3 1116228559616397312
#>  4 4374655520
#>  5 1172885625068036102
#>  6 1132474594583928832
#>  7 20737729
#>  8 2198859787
#>  9 3237877098
#> 10 868219009350676481
#> # … with 1,218 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "user"

tweet_df %>% as_proto_net(target_class = "hashtag", as_tibble = TRUE)
#> $edges
#> # A tibble: 267 x 4
#>    from                to                    status_id           relation
#>    <chr>               <chr>                 <chr>               <chr>
#>  1 218889555           sooners               1178007817460039683 uses
#>  2 93732206            conejopride           1178007867770597377 uses
#>  3 93732206            inclusionishappening  1178007867770597377 uses
#>  4 93732206            ireadbannedbooks      1178007867770597377 uses
#>  5 1165554985406304257 ป๋อจ้าน                 1178007888725344256 uses
#>  6 1270106738          theboss               1178007888742293504 uses
#>  7 1270106738          jefa                  1178007888742293504 uses
#>  8 1270106738          nahreptop250          1178007888742293504 uses
#>  9 1270106738          nahrep                1178007888742293504 uses
#> 10 1270106738          olivaresandmolinateam 1178007888742293504 uses
#> # … with 257 more rows
#>
#> $nodes
#> # A tibble: 366 x 1
#>    name
#>    <chr>
#>  1 218889555
#>  2 93732206
#>  3 1165554985406304257
#>  4 1270106738
#>  5 1065721261744107521
#>  6 1177166063710035968
#>  7 979945309370572800
#>  8 110467298
#>  9 1177725296306540545
#> 10 16429082
#> # … with 356 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "hashtag"

tweet_df %>% as_proto_net(target_class = "url", as_tibble = TRUE)
#> $edges
#> # A tibble: 103 x 4
#>    from         to                                        status_id     relation
#>    <chr>        <chr>                                     <chr>         <chr>
#>  1 401300296    https://www.instagram.com/p/B29ty1WI8Hq/… 117800782164… uses
#>  2 4374655520   https://twitter.com/richie_ixii/status/1… 117800783000… uses
#>  3 2198859787   https://twitter.com/_trapicuI/status/117… 117800784678… uses
#>  4 3237877098   http://M.Tech                             117800785100… uses
#>  5 2204772030   https://www.instagram.com/p/B29tyfugf-mH… 117800788456… uses
#>  6 1270106738   https://www.instagram.com/p/B29t0pfj69T/… 117800788874… uses
#>  7 11771660637… https://twitter.com/paritchi/status/1177… 117800792648… uses
#>  8 1421542118   https://twitter.com/damndrosetweets/stat… 117800801458… uses
#>  9 2551085480   https://twitter.com/tioorochi/status/117… 117800803555… uses
#> 10 92930913417… https://www.instagram.com/p/B29t5e-nBMJ/… 117800804391… uses
#> # … with 93 more rows
#>
#> $nodes
#> # A tibble: 198 x 1
#>    name
#>    <chr>
#>  1 401300296
#>  2 4374655520
#>  3 2198859787
#>  4 3237877098
#>  5 2204772030
#>  6 1270106738
#>  7 1177166063710035968
#>  8 1421542118
#>  9 2551085480
#> 10 929309134175993862
#> # … with 188 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "url"

tweet_df %>% as_proto_net(target_class = "media", as_tibble = TRUE)
#> $edges
#> # A tibble: 137 x 4
#>    from         to                                        status_id     relation
#>    <chr>        <chr>                                     <chr>         <chr>
#>  1 11728856250… http://pbs.twimg.com/media/EFjoYwdXoAIMN… 117800783841… uses
#>  2 11324745945… http://pbs.twimg.com/ext_tw_video_thumb/… 117800784259… uses
#>  3 93732206     http://pbs.twimg.com/ext_tw_video_thumb/… 117800786777… uses
#>  4 11214555122… http://pbs.twimg.com/media/EFhtCsZUcAAWB… 117800788455… uses
#>  5 10657212617… http://pbs.twimg.com/media/EFkf8M8W4AAIb… 117800790970… uses
#>  6 10657212617… http://pbs.twimg.com/media/EFkf8M9XkAA_U… 117800790970… uses
#>  7 10657212617… http://pbs.twimg.com/media/EFkf8i9WkAE96… 117800790970… uses
#>  8 10429113052… http://pbs.twimg.com/media/EFkf-xWUwAEl_… 117800794325… uses
#>  9 110467298    http://pbs.twimg.com/media/EFZmJaaXkAY1g… 117800794746… uses
#> 10 98041717153… http://pbs.twimg.com/tweet_video_thumb/E… 117800796425… uses
#> # … with 127 more rows
#>
#> $nodes
#> # A tibble: 253 x 1
#>    name
#>    <chr>
#>  1 1172885625068036102
#>  2 1132474594583928832
#>  3 93732206
#>  4 1121455512283832320
#>  5 1065721261744107521
#>  6 1042911305202454528
#>  7 110467298
#>  8 980417171531685888
#>  9 2328141540
#> 10 834583812977881090
#> # … with 243 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "media"

tweet_df %>% as_proto_net(all_status_data = TRUE, all_user_data = TRUE, as_tibble = TRUE)
#> $edges
#> # A tibble: 1,234 x 20
#>    from  to    status_id relation created_at          text  status_url source
#>    <chr> <chr> <chr>     <chr>    <dttm>              <chr> <chr>      <chr>
#>  1 8627… 8453… 11780103… retweet  2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  2 8627… 8453… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  3 8627… 1579… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  4 8627… 1508… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  5 8627… 5943… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  6 3688… 8698… 11780103… retweet  2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  7 3688… 8698… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  8 3688… 7661… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  9 3688… 1876… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#> 10 3688… 5882… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#> # … with 1,224 more rows, and 12 more variables: is_quote <lgl>,
#> #   is_retweeted <lgl>, media_url <list>, media_type <list>, place_url <chr>,
#> #   place_name <chr>, place_full_name <chr>, place_type <chr>, country <chr>,
#> #   country_code <chr>, bbox_coords <list>, status_type <chr>
#>
#> $nodes
#> # A tibble: 1,228 x 20
#>    name  timestamp_ms        name.y    screen_name location description    url
#>    <chr> <dttm>              <chr>     <chr>       <chr>    <chr>          <chr>
#>  1 1001… 2019-09-28 18:15:00 ブリアナ.… yoongi_Far… 新宿     "⎷ = 私は秘密の… NA
#>  2 1001… 2019-09-28 18:13:04 Local…    ornella944… Bronx, … "Snap:orne…    NA
#>  3 1001… 2019-09-28 18:08:35 RAWAN…    rawanjehad… Jeddah   "Respirato…    http…
#>  4 1001… 2019-09-28 18:12:13 Shash…    sks_rk      Kolkata… "Right to …    NA
#>  5 1003… 2019-09-28 18:06:45 J🥀       cherrydoze  Garland… "3/22🥝"       http…
#>  6 1003… 2019-09-28 18:13:16 Side03    Side031     Space F… "Twitter d…    NA
#>  7 1003… 2019-09-28 18:09:58 J.O.K…    AlexisAbat… NA       "Money Lif…    NA
#>  8 1003… 2019-09-28 18:07:53 ti_be…    t_benjy     Bourges… "Bad Vibes…    NA
#>  9 1003… 2019-09-28 18:11:28 A A R…    ManojManuu_ India    "@alluarju…    NA
#> 10 1003… 2019-09-28 18:06:26 백혀_엑소… i8Cg9ETNlz… Taipei … "@B_hundre…    http…
#> # … with 1,218 more rows, and 13 more variables: protected <lgl>,
#> #   followers_count <int>, friends_count <int>, listed_count <int>,
#> #   statuses_count <int>, favourites_count <int>, account_created_at <dttm>,
#> #   verified <lgl>, profile_url <chr>, account_lang <chr>,
#> #   profile_banner_url <chr>, profile_image_url <chr>, bbox_coords <list>
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "user"