A proto_net is a list containing two data frames named edges and nodes.

as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)

# S3 method for data.frame
as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)

# S3 method for data.table
as_proto_net(
  tweet_df,
  target_class = c("user", "hashtag", "url", "media"),
  all_status_data = FALSE,
  all_user_data = FALSE,
  as_tibble = tweetio_as_tibble(),
  ...
)

Arguments

tweet_df

A data frame of tweets, as obtained by read_tweets() or one of {rtweet}'s collection functions, e.g. rtweet::search_tweets().

target_class

character(1L), Default: "user". The class of nodes to use as the second half of each dyad (target/to/head). See Details.

all_status_data

logical(1L), Default: FALSE. Whether to attach all relevant status data to the edges data frame, which can then be used as edge attributes for downstream tasks.

all_user_data

logical(1L), Default: FALSE. Whether to attach all relevant user data to the nodes data frame, which can then be used as node attributes for downstream tasks.

as_tibble

<logical>, Default: tweetio_as_tibble(). Whether a tibble::tibble() should be returned. Ignored if the {tibble} package is not installed.

...

Arguments passed to or from other methods.

Details

  • In a proto_net, users are always on the source/from/tail side of dyads. target_class defaults to "user", which creates edges where users are on both sides of dyads.

    • However, users can also share edges with "hashtag"s, "url"s, or "media", so those values are also valid for target_class and create two-mode/bipartite proto_nets.

  • The edges of a proto_net represent the statuses that form each tie, and status-specific columns are attached to the edges.

  • Casing

    • Twitter hashtags are not case-sensitive, so if target_class is "hashtag" they will be cast to lower-case so they can represent the same node in downstream tasks.

    • URLs (after the domain) can be case-sensitive, so they are left as-is.

      • If you decide to cast URLs to lower-case yourself, exercise caution with Twitter's media URLs, as they contain a case-sensitive hash.

  • all_status_data and all_user_data default to FALSE as they can be performance bottlenecks for large data sets, but they provide a way of building richly decorated networks with the maximum amount of attribute data embedded in the graph structure.
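
The structure and casing rules described above can be sketched in base R. This is only an illustration built by hand, not the package's implementation: the toy data, the object name toy_proto_net, and the way the nodes are collected are all assumptions made for the example. The column names and the "uses" relation follow the printed output in the Examples.

```r
# Hypothetical toy data: three statuses, each using one hashtag.
statuses <- data.frame(
  user_id   = c("u1", "u2", "u3"),
  hashtag   = c("RStats", "rstats", "DataViz"),
  status_id = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)

# Users sit on the source side; lower-cased hashtags sit on the target side,
# so "RStats" and "rstats" collapse into a single node.
edges <- data.frame(
  from      = statuses$user_id,
  to        = tolower(statuses$hashtag),
  status_id = statuses$status_id,
  relation  = "uses",
  stringsAsFactors = FALSE
)

# One row per distinct node, users and hashtags alike (assumed layout).
nodes <- data.frame(
  name = unique(c(edges$from, edges$to)),
  stringsAsFactors = FALSE
)

# Mimic the shape of the printed object: a classed list of two data frames
# that carries a target_class attribute.
toy_proto_net <- structure(
  list(edges = edges, nodes = nodes),
  class = "proto_net",
  target_class = "hashtag"
)

unique(toy_proto_net$edges$to)
```

Because the hashtags are lower-cased before the nodes are collected, the two spellings of the same hashtag end up as one target node, which is the behaviour the Casing notes describe for target_class = "hashtag".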

Examples

path_to_tweet_file <- example_tweet_file()
tweet_df <- read_tweets(path_to_tweet_file)

tweet_df %>% as_proto_net(as_tibble = TRUE)
#> $edges
#> # A tibble: 1,234 x 4
#>    from                to                  status_id           relation
#>    <chr>               <chr>               <chr>               <chr>
#>  1 194250838           340309688           1178007813257388032 retweet
#>  2 825459487821619201  966825602           1178007817426370560 retweet
#>  3 1116228559616397312 3153145782          1178007830034448384 retweet
#>  4 4374655520          4167284315          1178007830005080064 retweet
#>  5 1172885625068036102 1112877891841343488 1178007838418984960 retweet
#>  6 1132474594583928832 1217220278          1178007842592149505 retweet
#>  7 20737729            39334221            1178007846790692864 retweet
#>  8 2198859787          3096758526          1178007846782287878 retweet
#>  9 3237877098          847885187325276164  1178007851001737217 retweet
#> 10 868219009350676481  1962155772          1178007859403132928 retweet
#> # … with 1,224 more rows
#>
#> $nodes
#> # A tibble: 1,228 x 1
#>    name
#>    <chr>
#>  1 194250838
#>  2 825459487821619201
#>  3 1116228559616397312
#>  4 4374655520
#>  5 1172885625068036102
#>  6 1132474594583928832
#>  7 20737729
#>  8 2198859787
#>  9 3237877098
#> 10 868219009350676481
#> # … with 1,218 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "user"

tweet_df %>% as_proto_net(target_class = "hashtag", as_tibble = TRUE)
#> $edges
#> # A tibble: 267 x 4
#>    from                to                    status_id           relation
#>    <chr>               <chr>                 <chr>               <chr>
#>  1 218889555           sooners               1178007817460039683 uses
#>  2 93732206            conejopride           1178007867770597377 uses
#>  3 93732206            inclusionishappening  1178007867770597377 uses
#>  4 93732206            ireadbannedbooks      1178007867770597377 uses
#>  5 1165554985406304257 ป๋อจ้าน                 1178007888725344256 uses
#>  6 1270106738          theboss               1178007888742293504 uses
#>  7 1270106738          jefa                  1178007888742293504 uses
#>  8 1270106738          nahreptop250          1178007888742293504 uses
#>  9 1270106738          nahrep                1178007888742293504 uses
#> 10 1270106738          olivaresandmolinateam 1178007888742293504 uses
#> # … with 257 more rows
#>
#> $nodes
#> # A tibble: 366 x 1
#>    name
#>    <chr>
#>  1 218889555
#>  2 93732206
#>  3 1165554985406304257
#>  4 1270106738
#>  5 1065721261744107521
#>  6 1177166063710035968
#>  7 979945309370572800
#>  8 110467298
#>  9 1177725296306540545
#> 10 16429082
#> # … with 356 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "hashtag"

tweet_df %>% as_proto_net(target_class = "url", as_tibble = TRUE)
#> $edges
#> # A tibble: 103 x 4
#>    from         to                                        status_id     relation
#>    <chr>        <chr>                                     <chr>         <chr>
#>  1 401300296    https://www.instagram.com/p/B29ty1WI8Hq/… 117800782164… uses
#>  2 4374655520   https://twitter.com/richie_ixii/status/1… 117800783000… uses
#>  3 2198859787   https://twitter.com/_trapicuI/status/117… 117800784678… uses
#>  4 3237877098   http://M.Tech                             117800785100… uses
#>  5 2204772030   https://www.instagram.com/p/B29tyfugf-mH… 117800788456… uses
#>  6 1270106738   https://www.instagram.com/p/B29t0pfj69T/… 117800788874… uses
#>  7 11771660637… https://twitter.com/paritchi/status/1177… 117800792648… uses
#>  8 1421542118   https://twitter.com/damndrosetweets/stat… 117800801458… uses
#>  9 2551085480   https://twitter.com/tioorochi/status/117… 117800803555… uses
#> 10 92930913417… https://www.instagram.com/p/B29t5e-nBMJ/… 117800804391… uses
#> # … with 93 more rows
#>
#> $nodes
#> # A tibble: 198 x 1
#>    name
#>    <chr>
#>  1 401300296
#>  2 4374655520
#>  3 2198859787
#>  4 3237877098
#>  5 2204772030
#>  6 1270106738
#>  7 1177166063710035968
#>  8 1421542118
#>  9 2551085480
#> 10 929309134175993862
#> # … with 188 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "url"

tweet_df %>% as_proto_net(target_class = "media", as_tibble = TRUE)
#> $edges
#> # A tibble: 137 x 4
#>    from         to                                        status_id     relation
#>    <chr>        <chr>                                     <chr>         <chr>
#>  1 11728856250… http://pbs.twimg.com/media/EFjoYwdXoAIMN… 117800783841… uses
#>  2 11324745945… http://pbs.twimg.com/ext_tw_video_thumb/… 117800784259… uses
#>  3 93732206     http://pbs.twimg.com/ext_tw_video_thumb/… 117800786777… uses
#>  4 11214555122… http://pbs.twimg.com/media/EFhtCsZUcAAWB… 117800788455… uses
#>  5 10657212617… http://pbs.twimg.com/media/EFkf8M8W4AAIb… 117800790970… uses
#>  6 10657212617… http://pbs.twimg.com/media/EFkf8M9XkAA_U… 117800790970… uses
#>  7 10657212617… http://pbs.twimg.com/media/EFkf8i9WkAE96… 117800790970… uses
#>  8 10429113052… http://pbs.twimg.com/media/EFkf-xWUwAEl_… 117800794325… uses
#>  9 110467298    http://pbs.twimg.com/media/EFZmJaaXkAY1g… 117800794746… uses
#> 10 98041717153… http://pbs.twimg.com/tweet_video_thumb/E… 117800796425… uses
#> # … with 127 more rows
#>
#> $nodes
#> # A tibble: 253 x 1
#>    name
#>    <chr>
#>  1 1172885625068036102
#>  2 1132474594583928832
#>  3 93732206
#>  4 1121455512283832320
#>  5 1065721261744107521
#>  6 1042911305202454528
#>  7 110467298
#>  8 980417171531685888
#>  9 2328141540
#> 10 834583812977881090
#> # … with 243 more rows
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "media"

tweet_df %>% as_proto_net(all_status_data = TRUE, all_user_data = TRUE, as_tibble = TRUE)
#> $edges
#> # A tibble: 1,234 x 20
#>    from  to    status_id relation created_at          text  status_url source
#>    <chr> <chr> <chr>     <chr>    <dttm>              <chr> <chr>      <chr>
#>  1 8627… 8453… 11780103… retweet  2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  2 8627… 8453… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  3 8627… 1579… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  4 8627… 1508… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  5 8627… 5943… 11780103… mentions 2019-09-28 18:15:22 RT @… https://t… Twitt…
#>  6 3688… 8698… 11780103… retweet  2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  7 3688… 8698… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  8 3688… 7661… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#>  9 3688… 1876… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#> 10 3688… 5882… 11780103… mentions 2019-09-28 18:15:20 RT @… https://t… Twitt…
#> # … with 1,224 more rows, and 12 more variables: is_quote <lgl>,
#> #   is_retweeted <lgl>, media_url <list>, media_type <list>, place_url <chr>,
#> #   place_name <chr>, place_full_name <chr>, place_type <chr>, country <chr>,
#> #   country_code <chr>, bbox_coords <list>, status_type <chr>
#>
#> $nodes
#> # A tibble: 1,228 x 20
#>    name  timestamp_ms        name.y    screen_name location description  url
#>    <chr> <dttm>              <chr>     <chr>       <chr>    <chr>        <chr>
#>  1 1001… 2019-09-28 18:15:00 ブリアナ.… yoongi_Far… 新宿     "⎷ = 私は秘密の… NA
#>  2 1001… 2019-09-28 18:13:04 Local…    ornella944… Bronx, … "Snap:orne…  NA
#>  3 1001… 2019-09-28 18:08:35 RAWAN…    rawanjehad… Jeddah   "Respirato…  http…
#>  4 1001… 2019-09-28 18:12:13 Shash…    sks_rk      Kolkata… "Right to …  NA
#>  5 1003… 2019-09-28 18:06:45 J🥀       cherrydoze  Garland… "3/22🥝"     http…
#>  6 1003… 2019-09-28 18:13:16 Side03    Side031     Space F… "Twitter d…  NA
#>  7 1003… 2019-09-28 18:09:58 J.O.K…    AlexisAbat… NA       "Money Lif…  NA
#>  8 1003… 2019-09-28 18:07:53 ti_be…    t_benjy     Bourges… "Bad Vibes…  NA
#>  9 1003… 2019-09-28 18:11:28 A A R…    ManojManuu_ India    "@alluarju…  NA
#> 10 1003… 2019-09-28 18:06:26 백혀_엑소… i8Cg9ETNlz… Taipei … "@B_hundre…  http…
#> # … with 1,218 more rows, and 13 more variables: protected <lgl>,
#> #   followers_count <int>, friends_count <int>, listed_count <int>,
#> #   statuses_count <int>, favourites_count <int>, account_created_at <dttm>,
#> #   verified <lgl>, profile_url <chr>, account_lang <chr>,
#> #   profile_banner_url <chr>, profile_image_url <chr>, bbox_coords <list>
#>
#> attr(,"class")
#> [1] "proto_net"
#> attr(,"target_class")
#> [1] "user"