The messages in {ironmarch}'s data are stored as HTML. The extract_*() functions pull specific tags, paragraph text, links, or reply text out of that HTML.

extract_html_tags(x, tag, collapse = "\n", ...)

extract_text(x, collapse = "\n", ...)

extract_links(x, collapse = NULL, ...)

extract_reply_text(x, collapse = "\n", ...)

Arguments

x

character(). HTML text to parse.

tag

character(1L). HTML tag to extract.

collapse

character(1L) or NULL. Default: "\n" (NULL for extract_links()). String used to join multiple results from the same element of x. If NULL, results are not joined and a list() is returned.

...

Arguments passed to or from other methods.

Value

If collapse is not NULL, a character() with one element per element of x. Otherwise, a list() of character() vectors, one per element of x.
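The effect of collapse on the return type can be illustrated with a small base R sketch. The helper collapse_results() and the results list are hypothetical and only mimic the documented behavior; they are not the package's implementation. In particular, returning "" for an element with no matches is how paste() behaves here, not something this page states about the extract_*() functions.

```r
# Hypothetical per-message results, as a parser might produce them:
# one character vector per input element.
results <- list(
  c("first paragraph", "second paragraph"),
  "only paragraph",
  character(0)
)

# Hypothetical helper mirroring the documented `collapse` semantics.
collapse_results <- function(results, collapse = "\n") {
  if (is.null(collapse)) {
    # No joining: keep one character vector per input element.
    return(results)
  }
  # Join each element's results into a single string.
  vapply(results, paste, character(1L), collapse = collapse)
}

collapse_results(results)                  # character vector of length 3
collapse_results(results, collapse = NULL) # list of length 3
```

Keeping collapse = NULL is the safer choice when the number of matches per message matters, since joining discards that count.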

Examples

messages <- im_core_dfs$core_message_posts$msg_post[c(51, 119)]
messages
#> [1] "\n<p>AS for the book you sent a link to: it is interesting but doesn't have anything to do with corporatism, just in the sense of corporate bodies being involved in government.</p>\n<p>For the time being you may post it in the Fascist Economics subforum <a href=\"<___base_url___>/index.php?/forum/25-fascist-economics/\" rel=\"\">http://ironmarch.org/index.php?/forum/25-fascist-economics/</a> , and if we decide that it doesn't belong it will be moved elsewhere.</p>\n"
#> [2] "\n<p>By Mosley's steps you mean fascism? yes, that entirely - It is my religion and Mosley is my prophet. A hardened, militant fighting movement for integral nationalism and principles that will drive away people who will only weaken us.</p>\n<p>I am not sure how much you have been looking out but I had an interview on VOR recently.</p>\n<p><a href=\"http://www.youtube.com/watch?v=TmuF4TzC7uE\" rel=\"external nofollow\">http://www.youtube.com/watch?v=TmuF4TzC7uE</a></p>\n<p><a href=\"http://www.youtube.com/watch?v=4OlvSJuX1xA\" rel=\"external nofollow\">http://www.youtube.com/watch?v=4OlvSJuX1xA</a></p>\n<p><a href=\"http://www.youtube.com/watch?v=m6gIHfi681Y\" rel=\"external nofollow\">http://www.youtube.com/watch?v=m6gIHfi681Y</a></p>\n<p><a href=\"http://www.youtube.com/watch?v=VjNw5TnC3rA\" rel=\"external nofollow\">http://www.youtube.com/watch?v=VjNw5TnC3rA</a></p>\n<p>After interview discussion.</p>\n<p><a href=\"http://www.youtube.com/watch?v=r13yzkH-rVI\" rel=\"external nofollow\">http://www.youtube.com/watch?v=r13yzkH-rVI</a></p>\n<p><a href=\"http://www.youtube.com/watch?v=uBgwuo5Jc6Y\" rel=\"external nofollow\">http://www.youtube.com/watch?v=uBgwuo5Jc6Y</a></p>\n<p><a href=\"http://www.youtube.com/watch?v=t-6LFEfcVPk\" rel=\"external nofollow\">http://www.youtube.com/watch?v=t-6LFEfcVPk</a></p>\n<p>I have recently rewritten the old attacks to be more clear, and we have a policy on the website.</p>\n<p><a href=\"http://integralistparty.zzl.org/ourpolicy.html\" rel=\"external nofollow\">http://integralistparty.zzl.org/ourpolicy.html</a></p>\n<p><a href=\"http://integralistparty.zzl.org/attack.html\" rel=\"external nofollow\">http://integralistparty.zzl.org/attack.html</a></p>\n"
extract_html_tags(messages, "p")
#> [1] "AS for the book you sent a link to: it is interesting but doesn't have anything to do with corporatism, just in the sense of corporate bodies being involved in government.\nFor the time being you may post it in the Fascist Economics subforum http://ironmarch.org/index.php?/forum/25-fascist-economics/ , and if we decide that it doesn't belong it will be moved elsewhere."
#> [2] "By Mosley's steps you mean fascism? yes, that entirely - It is my religion and Mosley is my prophet. A hardened, militant fighting movement for integral nationalism and principles that will drive away people who will only weaken us.\nI am not sure how much you have been looking out but I had an interview on VOR recently.\nhttp://www.youtube.com/watch?v=TmuF4TzC7uE\nhttp://www.youtube.com/watch?v=4OlvSJuX1xA\nhttp://www.youtube.com/watch?v=m6gIHfi681Y\nhttp://www.youtube.com/watch?v=VjNw5TnC3rA\nAfter interview discussion.\nhttp://www.youtube.com/watch?v=r13yzkH-rVI\nhttp://www.youtube.com/watch?v=uBgwuo5Jc6Y\nhttp://www.youtube.com/watch?v=t-6LFEfcVPk\nI have recently rewritten the old attacks to be more clear, and we have a policy on the website.\nhttp://integralistparty.zzl.org/ourpolicy.html\nhttp://integralistparty.zzl.org/attack.html"
extract_text(messages) # same as `extract_html_tags(messages, tag = "p")`
#> [1] "AS for the book you sent a link to: it is interesting but doesn't have anything to do with corporatism, just in the sense of corporate bodies being involved in government.\nFor the time being you may post it in the Fascist Economics subforum http://ironmarch.org/index.php?/forum/25-fascist-economics/ , and if we decide that it doesn't belong it will be moved elsewhere."
#> [2] "By Mosley's steps you mean fascism? yes, that entirely - It is my religion and Mosley is my prophet. A hardened, militant fighting movement for integral nationalism and principles that will drive away people who will only weaken us.\nI am not sure how much you have been looking out but I had an interview on VOR recently.\nhttp://www.youtube.com/watch?v=TmuF4TzC7uE\nhttp://www.youtube.com/watch?v=4OlvSJuX1xA\nhttp://www.youtube.com/watch?v=m6gIHfi681Y\nhttp://www.youtube.com/watch?v=VjNw5TnC3rA\nAfter interview discussion.\nhttp://www.youtube.com/watch?v=r13yzkH-rVI\nhttp://www.youtube.com/watch?v=uBgwuo5Jc6Y\nhttp://www.youtube.com/watch?v=t-6LFEfcVPk\nI have recently rewritten the old attacks to be more clear, and we have a policy on the website.\nhttp://integralistparty.zzl.org/ourpolicy.html\nhttp://integralistparty.zzl.org/attack.html"
extract_links(messages) # same as `extract_html_tags(messages, tag = "a", collapse = NULL)`
#> [[1]]
#> [1] "http://ironmarch.org/index.php?/forum/25-fascist-economics/"
#>
#> [[2]]
#> [1] "http://www.youtube.com/watch?v=TmuF4TzC7uE"
#> [2] "http://www.youtube.com/watch?v=4OlvSJuX1xA"
#> [3] "http://www.youtube.com/watch?v=m6gIHfi681Y"
#> [4] "http://www.youtube.com/watch?v=VjNw5TnC3rA"
#> [5] "http://www.youtube.com/watch?v=r13yzkH-rVI"
#> [6] "http://www.youtube.com/watch?v=uBgwuo5Jc6Y"
#> [7] "http://www.youtube.com/watch?v=t-6LFEfcVPk"
#> [8] "http://integralistparty.zzl.org/ourpolicy.html"
#> [9] "http://integralistparty.zzl.org/attack.html"
#>
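For readers who want to see roughly what tag extraction involves, the following is a minimal sketch using the {xml2} package. It is an assumption about one way to get comparable results, not the code behind the extract_*() functions. The helper sketch_extract_tags() and the html input are hypothetical, and the sketch always returns a list (the collapse = NULL case). As in the extract_links() output above, it returns the text inside each tag, not attribute values such as href.

```r
library(xml2)

# Hypothetical helper: returns the text of every `tag` element found in
# each HTML string, as a list with one character vector per input.
sketch_extract_tags <- function(x, tag) {
  lapply(x, function(html) {
    doc <- read_html(html)                        # parse the HTML fragment
    nodes <- xml_find_all(doc, paste0(".//", tag)) # all matching elements
    xml_text(nodes)                                # their text content
  })
}

# Made-up input for illustration only.
html <- c(
  "<p>First paragraph.</p><p>See <a href=\"https://example.org\">https://example.org</a></p>",
  "<p>Only one paragraph.</p>"
)

sketch_extract_tags(html, "p")
sketch_extract_tags(html, "a")
```

A message without any matching tag yields an empty character vector in this sketch, so the list always has the same length as the input.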