[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"blog-post-crash-course-in-cfml-and-jsoup":3},{"id":4,"title":5,"body":6,"date":3129,"description":12,"extension":3130,"meta":3131,"navigation":360,"path":3135,"published":360,"seo":3136,"slug":3137,"stem":3138,"summary":5,"tags":3139,"__hash__":3142},"blog/blog/2017-08-18-crash-course-in-cfml-and-jsoup.md","Crash Course In CFML & jsoup",{"type":7,"value":8,"toc":3118},"minimark",[9,13,18,21,24,40,43,62,66,68,71,76,81,101,111,114,210,212,216,229,231,235,246,276,278,288,292,294,299,302,306,489,498,502,505,734,737,894,897,901,908,1017,1024,1026,1030,1033,1037,1188,1201,1205,1212,1409,1425,1429,1435,1524,1534,1538,1541,1712,1728,1732,1743,1746,1968,1970,2201,2218,2220,2224,2239,2480,2503,2506,2542,2544,2548,2551,2555,2566,2806,2820,2824,2836,3075,3084,3088,3090,3095,3103,3111,3114],[10,11,12],"p",{},"Over the years I've made repeated use of the jsoup library so I figured it'd be nice to put out a little primer on using it with CFML.",[14,15,17],"h2",{"id":16},"what-is-jsoup","What Is jsoup?",[19,20],"hr",{},[10,22,23],{},"From the official site:",[25,26,27,37],"blockquote",{},[10,28,29,36],{},[30,31,35],"a",{"href":32,"rel":33},"https://jsoup.org/",[34],"nofollow","jsoup"," is a Java library for working with real-world HTML. 
It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.",[10,38,39],{},"jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.",[10,41,42],{},"jsoup allows you to do such things as:",[44,45,46,50,53,56,59],"ul",{},[47,48,49],"li",{},"Scrape and parse HTML from a URL, file, or string",[47,51,52],{},"Find and extract data, using DOM traversal or CSS selectors",[47,54,55],{},"Manipulate the HTML elements, attributes, and text",[47,57,58],{},"Clean user-submitted content against a safe white-list, to prevent XSS attacks",[47,60,61],{},"Output tidy HTML",[14,63,65],{"id":64},"getting-started","Getting Started",[19,67],{},[10,69,70],{},"There are a few ways to go about integrating jsoup into an application.",[72,73,75],"h4",{"id":74},"a-installing-with-commandbox","A) Installing With CommandBox",[25,77,78],{},[10,79,80],{},"Note: This assumes CommandBox version 3.7+",[10,82,83,84,88,89,92,93,96,97,100],{},"From the CommandBox CLI, create a new project directory: ",[85,86,87],"code",{},"mkdir cfml-jsoup-example",", and ",[85,90,91],{},"cd"," to the new folder. From there run ",[85,94,95],{},"init cfml-jsoup-example"," to create a ",[85,98,99],{},"box.json"," for your project.",[102,103,108],"pre",{"className":104,"code":106,"language":107},[105],"language-text","#> mkdir cfml-jsoup-example\n#> cd cfml-jsoup-example\n#> init cfml-jsoup-example\n","text",[85,109,106],{"__ignoreMap":110},"",[10,112,113],{},"Inside of box.json you want to add a dependency with a JAR endpoint that points to the URL where the JAR file is located. 
The installed JAR will always be homed in a directory named after the JAR file, but you can place that folder in any \"root\" folder of your choice.",[102,115,119],{"className":116,"code":117,"language":118,"meta":110,"style":110},"language-json shiki shiki-themes github-light github-dark github-dark monokai","{\n  \"name\": \"cfml-jsoup-example\",\n  \"dependencies\": {\n    \"jsoup-1.10.3\": \"jar:https://jsoup.org/packages/jsoup-1.10.3.jar\"\n  },\n  \"installPaths\": {\n    \"jsoup-1.10.3\": \"lib\\\\jsoup-1.10.3\"\n  }\n}\n","json",[85,120,121,130,147,156,167,173,181,198,204],{"__ignoreMap":110},[122,123,126],"span",{"class":124,"line":125},"line",1,[122,127,129],{"class":128},"s__4a","{\n",[122,131,133,137,140,144],{"class":124,"line":132},2,[122,134,136],{"class":135},"sHysA","  \"name\"",[122,138,139],{"class":128},": ",[122,141,143],{"class":142},"sSnWl","\"cfml-jsoup-example\"",[122,145,146],{"class":128},",\n",[122,148,150,153],{"class":124,"line":149},3,[122,151,152],{"class":135},"  \"dependencies\"",[122,154,155],{"class":128},": {\n",[122,157,159,162,164],{"class":124,"line":158},4,[122,160,161],{"class":135},"    \"jsoup-1.10.3\"",[122,163,139],{"class":128},[122,165,166],{"class":142},"\"jar:https://jsoup.org/packages/jsoup-1.10.3.jar\"\n",[122,168,170],{"class":124,"line":169},5,[122,171,172],{"class":128},"  },\n",[122,174,176,179],{"class":124,"line":175},6,[122,177,178],{"class":135},"  \"installPaths\"",[122,180,155],{"class":128},[122,182,184,186,188,191,195],{"class":124,"line":183},7,[122,185,161],{"class":135},[122,187,139],{"class":128},[122,189,190],{"class":142},"\"lib",[122,192,194],{"class":193},"s-7EV","\\\\",[122,196,197],{"class":142},"jsoup-1.10.3\"\n",[122,199,201],{"class":124,"line":200},8,[122,202,203],{"class":128},"  }\n",[122,205,207],{"class":124,"line":206},9,[122,208,209],{"class":128},"}\n",[19,211],{},[72,213,215],{"id":214},"b-installing-manually","B) Installing Manually",[10,217,218,219,224,225,228],{},"To manually 
install jsoup, you can simply go to the ",[30,220,223],{"href":221,"rel":222},"https://jsoup.org/download",[34],"official site download page"," and pull down the latest ",[85,226,227],{},"core library"," release. Place the JAR file in a folder of your choice within your project.",[19,230],{},[72,232,234],{"id":233},"mapping-the-jar-in-applicationcfc","Mapping the JAR In Application.cfc",[10,236,237,238,241,242,245],{},"Now we need to add the JAR to the Java Class Path. You can map it in your project's ",[85,239,240],{},"Application.cfc"," via ",[85,243,244],{},"this.javaSettings",".",[102,247,251],{"className":248,"code":249,"language":250,"meta":110,"style":110},"language-js shiki shiki-themes github-light github-dark github-dark monokai","this.javaSettings = { loadPaths: ['./your_dir'] };\n","js",[85,252,253],{"__ignoreMap":110},[122,254,255,259,262,266,269,273],{"class":124,"line":125},[122,256,258],{"class":257},"s0XdW","this",[122,260,261],{"class":128},".javaSettings ",[122,263,265],{"class":264},"sOw68","=",[122,267,268],{"class":128}," { loadPaths: [",[122,270,272],{"class":271},"svw4x","'./your_dir'",[122,274,275],{"class":128},"] };\n",[19,277],{},[25,279,280],{},[10,281,282,283,245],{},"To learn more on integrating 3rd party Java libraries in CFML, check out the ",[30,284,287],{"href":285,"rel":286},"https://cfdocs.org/java",[34],"CFDocs - Java Integration Guide",[14,289,291],{"id":290},"examples","Examples",[19,293],{},[295,296,298],"h3",{"id":297},"parsing-documents","Parsing Documents",[10,300,301],{},"A jsoup document can be a string of HTML-like data or data read in from a file as a string.",[72,303,305],{"id":304},"parse-a-document-from-an-html-string","Parse A Document From An HTML String",[102,307,311],{"className":308,"code":309,"language":310,"meta":110,"style":110},"language-java shiki shiki-themes github-light github-dark github-dark monokai","\u003Ccfscript>\n    // Create the jsoup object\n    Jsoup = createObject(\"java\", 
\"org.jsoup.Jsoup\");\n\n    // HTML string\n    html = \"\u003Chtml>\u003Chead>\u003Ctitle>CFML & jsoup Example\u003C/title>\u003C/head>\u003Cbody>Content about CFML and jsoup.\u003C/body>\u003C/html>\";\n    // Parse the string\n    document = Jsoup.parse(html);\n    // Extract content\n    title = document.title();\n    body = document.body().text();\n\n    writeOutput(\"\n        \u003Cdiv>Title: #title#\u003C/div>\n        \u003Cdiv>Body: #body#\u003C/div>\n    \");\n\u003C/cfscript>\n","java",[85,312,313,324,330,356,362,367,380,385,401,406,423,443,448,459,465,471,479],{"__ignoreMap":110},[122,314,315,318,321],{"class":124,"line":125},[122,316,317],{"class":264},"\u003C",[122,319,320],{"class":128},"cfscript",[122,322,323],{"class":264},">\n",[122,325,326],{"class":124,"line":132},[122,327,329],{"class":328},"sq-8i","    // Create the jsoup object\n",[122,331,332,335,337,341,344,347,350,353],{"class":124,"line":149},[122,333,334],{"class":128},"    Jsoup ",[122,336,265],{"class":264},[122,338,340],{"class":339},"s9ihy"," createObject",[122,342,343],{"class":128},"(",[122,345,346],{"class":271},"\"java\"",[122,348,349],{"class":128},", ",[122,351,352],{"class":271},"\"org.jsoup.Jsoup\"",[122,354,355],{"class":128},");\n",[122,357,358],{"class":124,"line":158},[122,359,361],{"emptyLinePlaceholder":360},true,"\n",[122,363,364],{"class":124,"line":169},[122,365,366],{"class":328},"    // HTML string\n",[122,368,369,372,374,377],{"class":124,"line":175},[122,370,371],{"class":128},"    html ",[122,373,265],{"class":264},[122,375,376],{"class":271}," \"\u003Chtml>\u003Chead>\u003Ctitle>CFML & jsoup Example\u003C/title>\u003C/head>\u003Cbody>Content about CFML and jsoup.\u003C/body>\u003C/html>\"",[122,378,379],{"class":128},";\n",[122,381,382],{"class":124,"line":183},[122,383,384],{"class":328},"    // Parse the string\n",[122,386,387,390,392,395,398],{"class":124,"line":200},[122,388,389],{"class":128},"    document 
",[122,391,265],{"class":264},[122,393,394],{"class":128}," Jsoup.",[122,396,397],{"class":339},"parse",[122,399,400],{"class":128},"(html);\n",[122,402,403],{"class":124,"line":206},[122,404,405],{"class":328},"    // Extract content\n",[122,407,409,412,414,417,420],{"class":124,"line":408},10,[122,410,411],{"class":128},"    title ",[122,413,265],{"class":264},[122,415,416],{"class":128}," document.",[122,418,419],{"class":339},"title",[122,421,422],{"class":128},"();\n",[122,424,426,429,431,433,436,439,441],{"class":124,"line":425},11,[122,427,428],{"class":128},"    body ",[122,430,265],{"class":264},[122,432,416],{"class":128},[122,434,435],{"class":339},"body",[122,437,438],{"class":128},"().",[122,440,107],{"class":339},[122,442,422],{"class":128},[122,444,446],{"class":124,"line":445},12,[122,447,361],{"emptyLinePlaceholder":360},[122,449,451,454,456],{"class":124,"line":450},13,[122,452,453],{"class":339},"    writeOutput",[122,455,343],{"class":128},[122,457,458],{"class":271},"\"\n",[122,460,462],{"class":124,"line":461},14,[122,463,464],{"class":271},"        \u003Cdiv>Title: #title#\u003C/div>\n",[122,466,468],{"class":124,"line":467},15,[122,469,470],{"class":271},"        \u003Cdiv>Body: #body#\u003C/div>\n",[122,472,474,477],{"class":124,"line":473},16,[122,475,476],{"class":271},"    \"",[122,478,355],{"class":128},[122,480,482,485,487],{"class":124,"line":481},17,[122,483,484],{"class":264},"\u003C/",[122,486,320],{"class":128},[122,488,323],{"class":264},[10,490,491,492,497],{},"The example code instantiates the Jsoup class and parses a string of HTML. 
This returns a ",[30,493,496],{"href":494,"rel":495},"https://jsoup.org/apidocs/org/jsoup/nodes/Document.html",[34],"Document class object"," that we can act on with its methods.",[72,499,501],{"id":500},"parsing-a-document-from-html-files","Parsing A Document From HTML Files",[10,503,504],{},"Consider the following example HTML...",[102,506,510],{"className":507,"code":508,"language":509,"meta":110,"style":110},"language-html shiki shiki-themes github-light github-dark github-dark monokai","\u003C!doctype html>\n\u003Chtml>\n  \u003Chead>\n    \u003Ctitle>CFML & jsoup Example\u003C/title>\n    \u003Cmeta charset=\"UTF-8\" />\n    \u003Cmeta name=\"keywords\" content=\"jsoup,cfml,java,html\" />\n    \u003Cmeta name=\"description\" content=\"Examples for using CFML and jsoup.\" />\n    \u003Cmeta name=\"author\" content=\"@tonyjunkes\" />\n  \u003C/head>\n  \u003Cbody>\n    \u003Cheader id=\"header\">Getting Started With CFML & jsoup\u003C/header>\n    \u003Cdiv>Some content...\u003C/div>\n    \u003Ca href=\"#\">A link to useful info\u003C/a>\n  \u003C/body>\n\u003C/html>\n","html",[85,511,512,526,534,544,558,576,600,622,644,653,661,683,697,718,726],{"__ignoreMap":110},[122,513,514,517,521,524],{"class":124,"line":125},[122,515,516],{"class":128},"\u003C!",[122,518,520],{"class":519},"sIn_X","doctype",[122,522,523],{"class":339}," html",[122,525,323],{"class":128},[122,527,528,530,532],{"class":124,"line":132},[122,529,317],{"class":128},[122,531,509],{"class":519},[122,533,323],{"class":128},[122,535,536,539,542],{"class":124,"line":149},[122,537,538],{"class":128},"  \u003C",[122,540,541],{"class":519},"head",[122,543,323],{"class":128},[122,545,546,549,551,554,556],{"class":124,"line":158},[122,547,548],{"class":128},"    \u003C",[122,550,419],{"class":519},[122,552,553],{"class":128},">CFML & jsoup 
Example\u003C/",[122,555,419],{"class":519},[122,557,323],{"class":128},[122,559,560,562,565,568,570,573],{"class":124,"line":169},[122,561,548],{"class":128},[122,563,564],{"class":519},"meta",[122,566,567],{"class":339}," charset",[122,569,265],{"class":128},[122,571,572],{"class":271},"\"UTF-8\"",[122,574,575],{"class":128}," />\n",[122,577,578,580,582,585,587,590,593,595,598],{"class":124,"line":175},[122,579,548],{"class":128},[122,581,564],{"class":519},[122,583,584],{"class":339}," name",[122,586,265],{"class":128},[122,588,589],{"class":271},"\"keywords\"",[122,591,592],{"class":339}," content",[122,594,265],{"class":128},[122,596,597],{"class":271},"\"jsoup,cfml,java,html\"",[122,599,575],{"class":128},[122,601,602,604,606,608,610,613,615,617,620],{"class":124,"line":183},[122,603,548],{"class":128},[122,605,564],{"class":519},[122,607,584],{"class":339},[122,609,265],{"class":128},[122,611,612],{"class":271},"\"description\"",[122,614,592],{"class":339},[122,616,265],{"class":128},[122,618,619],{"class":271},"\"Examples for using CFML and jsoup.\"",[122,621,575],{"class":128},[122,623,624,626,628,630,632,635,637,639,642],{"class":124,"line":200},[122,625,548],{"class":128},[122,627,564],{"class":519},[122,629,584],{"class":339},[122,631,265],{"class":128},[122,633,634],{"class":271},"\"author\"",[122,636,592],{"class":339},[122,638,265],{"class":128},[122,640,641],{"class":271},"\"@tonyjunkes\"",[122,643,575],{"class":128},[122,645,646,649,651],{"class":124,"line":206},[122,647,648],{"class":128},"  \u003C/",[122,650,541],{"class":519},[122,652,323],{"class":128},[122,654,655,657,659],{"class":124,"line":408},[122,656,538],{"class":128},[122,658,435],{"class":519},[122,660,323],{"class":128},[122,662,663,665,668,671,673,676,679,681],{"class":124,"line":425},[122,664,548],{"class":128},[122,666,667],{"class":519},"header",[122,669,670],{"class":339}," 
id",[122,672,265],{"class":128},[122,674,675],{"class":271},"\"header\"",[122,677,678],{"class":128},">Getting Started With CFML & jsoup\u003C/",[122,680,667],{"class":519},[122,682,323],{"class":128},[122,684,685,687,690,693,695],{"class":124,"line":445},[122,686,548],{"class":128},[122,688,689],{"class":519},"div",[122,691,692],{"class":128},">Some content...\u003C/",[122,694,689],{"class":519},[122,696,323],{"class":128},[122,698,699,701,703,706,708,711,714,716],{"class":124,"line":450},[122,700,548],{"class":128},[122,702,30],{"class":519},[122,704,705],{"class":339}," href",[122,707,265],{"class":128},[122,709,710],{"class":271},"\"#\"",[122,712,713],{"class":128},">A link to useful info\u003C/",[122,715,30],{"class":519},[122,717,323],{"class":128},[122,719,720,722,724],{"class":124,"line":461},[122,721,648],{"class":128},[122,723,435],{"class":519},[122,725,323],{"class":128},[122,727,728,730,732],{"class":124,"line":467},[122,729,484],{"class":128},[122,731,509],{"class":519},[122,733,323],{"class":128},[10,735,736],{},"And CFML...",[102,738,740],{"className":308,"code":739,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create the jsoup object\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    // Create the File object\n    JFile = createObject(\"java\", \"java.io.File\");\n\n    // Get the absolute file path\n    fileName = expandPath(\"./path/to/file.html\");\n    // Parse the File object and extract data\n    document = Jsoup.parse(JFile.init(fileName), \"utf-8\");\n    header = document.getElementById(\"header\");\n\n    
writeOutput(header.text());\n\u003C/cfscript>\n",[85,741,742,750,754,772,777,797,801,806,823,828,852,870,874,886],{"__ignoreMap":110},[122,743,744,746,748],{"class":124,"line":125},[122,745,317],{"class":264},[122,747,320],{"class":128},[122,749,323],{"class":264},[122,751,752],{"class":124,"line":132},[122,753,329],{"class":328},[122,755,756,758,760,762,764,766,768,770],{"class":124,"line":149},[122,757,334],{"class":128},[122,759,265],{"class":264},[122,761,340],{"class":339},[122,763,343],{"class":128},[122,765,346],{"class":271},[122,767,349],{"class":128},[122,769,352],{"class":271},[122,771,355],{"class":128},[122,773,774],{"class":124,"line":158},[122,775,776],{"class":328},"    // Create the File object\n",[122,778,779,782,784,786,788,790,792,795],{"class":124,"line":169},[122,780,781],{"class":128},"    JFile ",[122,783,265],{"class":264},[122,785,340],{"class":339},[122,787,343],{"class":128},[122,789,346],{"class":271},[122,791,349],{"class":128},[122,793,794],{"class":271},"\"java.io.File\"",[122,796,355],{"class":128},[122,798,799],{"class":124,"line":175},[122,800,361],{"emptyLinePlaceholder":360},[122,802,803],{"class":124,"line":183},[122,804,805],{"class":328},"    // Get the absolute file path\n",[122,807,808,811,813,816,818,821],{"class":124,"line":200},[122,809,810],{"class":128},"    fileName ",[122,812,265],{"class":264},[122,814,815],{"class":339}," expandPath",[122,817,343],{"class":128},[122,819,820],{"class":271},"\"./path/to/file.html\"",[122,822,355],{"class":128},[122,824,825],{"class":124,"line":206},[122,826,827],{"class":328},"    // Parse the File object and extract data\n",[122,829,830,832,834,836,838,841,844,847,850],{"class":124,"line":408},[122,831,389],{"class":128},[122,833,265],{"class":264},[122,835,394],{"class":128},[122,837,397],{"class":339},[122,839,840],{"class":128},"(JFile.",[122,842,843],{"class":339},"init",[122,845,846],{"class":128},"(fileName), 
",[122,848,849],{"class":271},"\"utf-8\"",[122,851,355],{"class":128},[122,853,854,857,859,861,864,866,868],{"class":124,"line":425},[122,855,856],{"class":128},"    header ",[122,858,265],{"class":264},[122,860,416],{"class":128},[122,862,863],{"class":339},"getElementById",[122,865,343],{"class":128},[122,867,675],{"class":271},[122,869,355],{"class":128},[122,871,872],{"class":124,"line":445},[122,873,361],{"emptyLinePlaceholder":360},[122,875,876,878,881,883],{"class":124,"line":450},[122,877,453],{"class":339},[122,879,880],{"class":128},"(header.",[122,882,107],{"class":339},[122,884,885],{"class":128},"());\n",[122,887,888,890,892],{"class":124,"line":461},[122,889,484],{"class":264},[122,891,320],{"class":128},[122,893,323],{"class":264},[10,895,896],{},"The example code demonstrates using jsoup to parse a Java File object that has the path to the HTML file set as the constructor parameter. This returns a Document to act on with its methods.",[72,898,900],{"id":899},"parsing-an-external-source","Parsing An External Source",[10,902,903,904,907],{},"We can connect to an external source using jsoup's ",[85,905,906],{},"connect()"," method.",[102,909,911],{"className":308,"code":910,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create the jsoup object\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n\n    // Connect\n    siteAddress = \"https://jsoup.org/\";\n    document = Jsoup.connect(siteAddress).get();\n\n    // Do things to act on the Document...\n\n    // Dump the object\n    
writeDump(document);\n\u003C/cfscript>\n",[85,912,913,921,925,943,947,952,964,983,987,992,996,1001,1009],{"__ignoreMap":110},[122,914,915,917,919],{"class":124,"line":125},[122,916,317],{"class":264},[122,918,320],{"class":128},[122,920,323],{"class":264},[122,922,923],{"class":124,"line":132},[122,924,329],{"class":328},[122,926,927,929,931,933,935,937,939,941],{"class":124,"line":149},[122,928,334],{"class":128},[122,930,265],{"class":264},[122,932,340],{"class":339},[122,934,343],{"class":128},[122,936,346],{"class":271},[122,938,349],{"class":128},[122,940,352],{"class":271},[122,942,355],{"class":128},[122,944,945],{"class":124,"line":158},[122,946,361],{"emptyLinePlaceholder":360},[122,948,949],{"class":124,"line":169},[122,950,951],{"class":328},"    // Connect\n",[122,953,954,957,959,962],{"class":124,"line":175},[122,955,956],{"class":128},"    siteAddress ",[122,958,265],{"class":264},[122,960,961],{"class":271}," \"https://jsoup.org/\"",[122,963,379],{"class":128},[122,965,966,968,970,972,975,978,981],{"class":124,"line":183},[122,967,389],{"class":128},[122,969,265],{"class":264},[122,971,394],{"class":128},[122,973,974],{"class":339},"connect",[122,976,977],{"class":128},"(siteAddress).",[122,979,980],{"class":339},"get",[122,982,422],{"class":128},[122,984,985],{"class":124,"line":200},[122,986,361],{"emptyLinePlaceholder":360},[122,988,989],{"class":124,"line":206},[122,990,991],{"class":328},"    // Do things to act on the Document...\n",[122,993,994],{"class":124,"line":408},[122,995,361],{"emptyLinePlaceholder":360},[122,997,998],{"class":124,"line":425},[122,999,1000],{"class":328},"    // Dump the object\n",[122,1002,1003,1006],{"class":124,"line":445},[122,1004,1005],{"class":339},"    writeDump",[122,1007,1008],{"class":128},"(document);\n",[122,1010,1011,1013,1015],{"class":124,"line":450},[122,1012,484],{"class":264},[122,1014,320],{"class":128},[122,1016,323],{"class":264},[10,1018,1019,1020,1023],{},"So we take a website URL address and 
pass it to ",[85,1021,1022],{},"Jsoup.connect()"," and, so long as the site resolves to a valid page, we are returned a Document object to act on. The example above only dumps the returned object to show various functions available to use on the collected content.",[19,1025],{},[295,1027,1029],{"id":1028},"a-deeper-look-at-working-with-documents","A Deeper Look At Working With Documents",[10,1031,1032],{},"Using the same example HTML file content displayed earlier, we will grab various meta data from a Document object.",[72,1034,1036],{"id":1035},"reading-from-the-head-element","Reading From the \u003Chead> Element",[102,1038,1040],{"className":308,"code":1039,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create object, pass in file and parse\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    JFile = createObject(\"java\", \"java.io.File\");\n\n    fileName = expandPath(\"./path/to/file.html\");\n    document = Jsoup.parse(JFile.init(fileName), \"utf-8\");\n\n    title = document.title();\n    head = document.head();\n\n    writeOutput(\"Title: #title#\");\n    writeDump(head);\n\u003C/cfscript>\n",[85,1041,1042,1050,1055,1073,1091,1095,1109,1129,1133,1145,1158,1162,1173,1180],{"__ignoreMap":110},[122,1043,1044,1046,1048],{"class":124,"line":125},[122,1045,317],{"class":264},[122,1047,320],{"class":128},[122,1049,323],{"class":264},[122,1051,1052],{"class":124,"line":132},[122,1053,1054],{"class":328},"    // Create object, pass in file and 
parse\n",[122,1056,1057,1059,1061,1063,1065,1067,1069,1071],{"class":124,"line":149},[122,1058,334],{"class":128},[122,1060,265],{"class":264},[122,1062,340],{"class":339},[122,1064,343],{"class":128},[122,1066,346],{"class":271},[122,1068,349],{"class":128},[122,1070,352],{"class":271},[122,1072,355],{"class":128},[122,1074,1075,1077,1079,1081,1083,1085,1087,1089],{"class":124,"line":158},[122,1076,781],{"class":128},[122,1078,265],{"class":264},[122,1080,340],{"class":339},[122,1082,343],{"class":128},[122,1084,346],{"class":271},[122,1086,349],{"class":128},[122,1088,794],{"class":271},[122,1090,355],{"class":128},[122,1092,1093],{"class":124,"line":169},[122,1094,361],{"emptyLinePlaceholder":360},[122,1096,1097,1099,1101,1103,1105,1107],{"class":124,"line":175},[122,1098,810],{"class":128},[122,1100,265],{"class":264},[122,1102,815],{"class":339},[122,1104,343],{"class":128},[122,1106,820],{"class":271},[122,1108,355],{"class":128},[122,1110,1111,1113,1115,1117,1119,1121,1123,1125,1127],{"class":124,"line":183},[122,1112,389],{"class":128},[122,1114,265],{"class":264},[122,1116,394],{"class":128},[122,1118,397],{"class":339},[122,1120,840],{"class":128},[122,1122,843],{"class":339},[122,1124,846],{"class":128},[122,1126,849],{"class":271},[122,1128,355],{"class":128},[122,1130,1131],{"class":124,"line":200},[122,1132,361],{"emptyLinePlaceholder":360},[122,1134,1135,1137,1139,1141,1143],{"class":124,"line":206},[122,1136,411],{"class":128},[122,1138,265],{"class":264},[122,1140,416],{"class":128},[122,1142,419],{"class":339},[122,1144,422],{"class":128},[122,1146,1147,1150,1152,1154,1156],{"class":124,"line":408},[122,1148,1149],{"class":128},"    head 
",[122,1151,265],{"class":264},[122,1153,416],{"class":128},[122,1155,541],{"class":339},[122,1157,422],{"class":128},[122,1159,1160],{"class":124,"line":425},[122,1161,361],{"emptyLinePlaceholder":360},[122,1163,1164,1166,1168,1171],{"class":124,"line":445},[122,1165,453],{"class":339},[122,1167,343],{"class":128},[122,1169,1170],{"class":271},"\"Title: #title#\"",[122,1172,355],{"class":128},[122,1174,1175,1177],{"class":124,"line":450},[122,1176,1005],{"class":339},[122,1178,1179],{"class":128},"(head);\n",[122,1181,1182,1184,1186],{"class":124,"line":461},[122,1183,484],{"class":264},[122,1185,320],{"class":128},[122,1187,323],{"class":264},[10,1189,1190,1191,1193,1194,1197,1198,245],{},"Once we have parsed the HTML source, we can access data like ",[85,1192,419],{}," or everything in the ",[85,1195,1196],{},"\u003Chead>"," element with ",[85,1199,1200],{},"head()",[72,1202,1204],{"id":1203},"getting-metadata-with-selectors","Getting Metadata With Selectors",[10,1206,1207,1208,1211],{},"From the Document object, we can use the ",[85,1209,1210],{},"select()"," method and pass in selector syntax, similar to jQuery, as the parameter to match and retrieve the metadata values.",[102,1213,1215],{"className":308,"code":1214,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create object, pass in file and parse\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    JFile = createObject(\"java\", \"java.io.File\");\n\n    fileName = expandPath(\"./path/to/file.html\");\n    document = Jsoup.parse(JFile.init(fileName), \"utf-8\");\n    // Get metadata\n    description = document.select(\"meta[name=description]\").first().attr(\"content\");\n    keywords = document.select(\"meta[name=keywords]\").first().attr(\"content\");\n\n    writeOutput(\"\n        \u003Cp>Description: #description#\u003C/p>\n        \u003Cp>Keywords: #keywords#\u003C/p>\n    
\");\n\u003C/cfscript>\n",[85,1216,1217,1225,1229,1247,1265,1269,1283,1303,1308,1343,1373,1377,1385,1390,1395,1401],{"__ignoreMap":110},[122,1218,1219,1221,1223],{"class":124,"line":125},[122,1220,317],{"class":264},[122,1222,320],{"class":128},[122,1224,323],{"class":264},[122,1226,1227],{"class":124,"line":132},[122,1228,1054],{"class":328},[122,1230,1231,1233,1235,1237,1239,1241,1243,1245],{"class":124,"line":149},[122,1232,334],{"class":128},[122,1234,265],{"class":264},[122,1236,340],{"class":339},[122,1238,343],{"class":128},[122,1240,346],{"class":271},[122,1242,349],{"class":128},[122,1244,352],{"class":271},[122,1246,355],{"class":128},[122,1248,1249,1251,1253,1255,1257,1259,1261,1263],{"class":124,"line":158},[122,1250,781],{"class":128},[122,1252,265],{"class":264},[122,1254,340],{"class":339},[122,1256,343],{"class":128},[122,1258,346],{"class":271},[122,1260,349],{"class":128},[122,1262,794],{"class":271},[122,1264,355],{"class":128},[122,1266,1267],{"class":124,"line":169},[122,1268,361],{"emptyLinePlaceholder":360},[122,1270,1271,1273,1275,1277,1279,1281],{"class":124,"line":175},[122,1272,810],{"class":128},[122,1274,265],{"class":264},[122,1276,815],{"class":339},[122,1278,343],{"class":128},[122,1280,820],{"class":271},[122,1282,355],{"class":128},[122,1284,1285,1287,1289,1291,1293,1295,1297,1299,1301],{"class":124,"line":183},[122,1286,389],{"class":128},[122,1288,265],{"class":264},[122,1290,394],{"class":128},[122,1292,397],{"class":339},[122,1294,840],{"class":128},[122,1296,843],{"class":339},[122,1298,846],{"class":128},[122,1300,849],{"class":271},[122,1302,355],{"class":128},[122,1304,1305],{"class":124,"line":200},[122,1306,1307],{"class":328},"    // Get metadata\n",[122,1309,1310,1313,1315,1317,1320,1322,1325,1328,1331,1333,1336,1338,1341],{"class":124,"line":206},[122,1311,1312],{"class":128},"    description 
",[122,1314,265],{"class":264},[122,1316,416],{"class":128},[122,1318,1319],{"class":339},"select",[122,1321,343],{"class":128},[122,1323,1324],{"class":271},"\"meta[name=description]\"",[122,1326,1327],{"class":128},").",[122,1329,1330],{"class":339},"first",[122,1332,438],{"class":128},[122,1334,1335],{"class":339},"attr",[122,1337,343],{"class":128},[122,1339,1340],{"class":271},"\"content\"",[122,1342,355],{"class":128},[122,1344,1345,1348,1350,1352,1354,1356,1359,1361,1363,1365,1367,1369,1371],{"class":124,"line":408},[122,1346,1347],{"class":128},"    keywords ",[122,1349,265],{"class":264},[122,1351,416],{"class":128},[122,1353,1319],{"class":339},[122,1355,343],{"class":128},[122,1357,1358],{"class":271},"\"meta[name=keywords]\"",[122,1360,1327],{"class":128},[122,1362,1330],{"class":339},[122,1364,438],{"class":128},[122,1366,1335],{"class":339},[122,1368,343],{"class":128},[122,1370,1340],{"class":271},[122,1372,355],{"class":128},[122,1374,1375],{"class":124,"line":425},[122,1376,361],{"emptyLinePlaceholder":360},[122,1378,1379,1381,1383],{"class":124,"line":445},[122,1380,453],{"class":339},[122,1382,343],{"class":128},[122,1384,458],{"class":271},[122,1386,1387],{"class":124,"line":450},[122,1388,1389],{"class":271},"        \u003Cp>Description: #description#\u003C/p>\n",[122,1391,1392],{"class":124,"line":461},[122,1393,1394],{"class":271},"        \u003Cp>Keywords: #keywords#\u003C/p>\n",[122,1396,1397,1399],{"class":124,"line":467},[122,1398,476],{"class":271},[122,1400,355],{"class":128},[122,1402,1403,1405,1407],{"class":124,"line":473},[122,1404,484],{"class":264},[122,1406,320],{"class":128},[122,1408,323],{"class":264},[10,1410,1411,1412,1417,1418,1421,1422,245],{},"We pass in a selector to query the meta elements, which returns an ",[30,1413,1416],{"href":1414,"rel":1415},"https://jsoup.org/apidocs/org/jsoup/nodes/Element.html",[34],"Elements class object",". 
Then we can access its key (attribute) values using ",[85,1419,1420],{},"attr()",". We also have access to various helper methods like ",[85,1423,1424],{},"first(), last(), next() & prev()",[72,1426,1428],{"id":1427},"getting-the-html-source-from-an-external-document","Getting the HTML Source From An External Document",[10,1430,1431,1432,245],{},"We can get the raw HTML source of a Document object by calling a parent method: ",[85,1433,1434],{},"html()",[102,1436,1438],{"className":308,"code":1437,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create the jsoup object and connect\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n\n    siteAddress = \"https://jsoup.org/\";\n    document = Jsoup.connect(siteAddress).get();\n\n    writeDump(document.html());\n\u003C/cfscript>\n",[85,1439,1440,1448,1453,1471,1475,1485,1501,1505,1516],{"__ignoreMap":110},[122,1441,1442,1444,1446],{"class":124,"line":125},[122,1443,317],{"class":264},[122,1445,320],{"class":128},[122,1447,323],{"class":264},[122,1449,1450],{"class":124,"line":132},[122,1451,1452],{"class":328},"    // Create the jsoup object and 
connect\n",[122,1454,1455,1457,1459,1461,1463,1465,1467,1469],{"class":124,"line":149},[122,1456,334],{"class":128},[122,1458,265],{"class":264},[122,1460,340],{"class":339},[122,1462,343],{"class":128},[122,1464,346],{"class":271},[122,1466,349],{"class":128},[122,1468,352],{"class":271},[122,1470,355],{"class":128},[122,1472,1473],{"class":124,"line":158},[122,1474,361],{"emptyLinePlaceholder":360},[122,1476,1477,1479,1481,1483],{"class":124,"line":169},[122,1478,956],{"class":128},[122,1480,265],{"class":264},[122,1482,961],{"class":271},[122,1484,379],{"class":128},[122,1486,1487,1489,1491,1493,1495,1497,1499],{"class":124,"line":175},[122,1488,389],{"class":128},[122,1490,265],{"class":264},[122,1492,394],{"class":128},[122,1494,974],{"class":339},[122,1496,977],{"class":128},[122,1498,980],{"class":339},[122,1500,422],{"class":128},[122,1502,1503],{"class":124,"line":183},[122,1504,361],{"emptyLinePlaceholder":360},[122,1506,1507,1509,1512,1514],{"class":124,"line":200},[122,1508,1005],{"class":339},[122,1510,1511],{"class":128},"(document.",[122,1513,509],{"class":339},[122,1515,885],{"class":128},[122,1517,1518,1520,1522],{"class":124,"line":206},[122,1519,484],{"class":264},[122,1521,320],{"class":128},[122,1523,323],{"class":264},[10,1525,1526,1527,1529,1530,1533],{},"The ",[85,1528,1434],{}," method is borrowed from the ",[85,1531,1532],{},"Elements"," class object.",[72,1535,1537],{"id":1536},"get-link-data","Get Link Data",[10,1539,1540],{},"Link attributes and content can be obtained using the same selector methods demonstrated earlier.",[102,1542,1544],{"className":308,"code":1543,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create object, pass in file and parse\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    JFile = createObject(\"java\", \"java.io.File\");\n\n    fileName = expandPath(\"./path/to/file.html\");\n    document = Jsoup.parse(JFile.init(fileName), \"utf-8\");\n\n    // Get an array of links\n    links = 
document.select(\"a[href]\");\n    for (link in links) {\n        writeOutput(\"\n            \u003Cdiv>Link: #link.attr(\"href\")#\u003C/div>\n            \u003Cdiv>Text: #link.text()#\u003C/div>\n        \");\n    }\n\u003C/cfscript>\n",[85,1545,1546,1554,1558,1576,1594,1598,1612,1632,1636,1641,1659,1667,1676,1687,1692,1699,1704],{"__ignoreMap":110},[122,1547,1548,1550,1552],{"class":124,"line":125},[122,1549,317],{"class":264},[122,1551,320],{"class":128},[122,1553,323],{"class":264},[122,1555,1556],{"class":124,"line":132},[122,1557,1054],{"class":328},[122,1559,1560,1562,1564,1566,1568,1570,1572,1574],{"class":124,"line":149},[122,1561,334],{"class":128},[122,1563,265],{"class":264},[122,1565,340],{"class":339},[122,1567,343],{"class":128},[122,1569,346],{"class":271},[122,1571,349],{"class":128},[122,1573,352],{"class":271},[122,1575,355],{"class":128},[122,1577,1578,1580,1582,1584,1586,1588,1590,1592],{"class":124,"line":158},[122,1579,781],{"class":128},[122,1581,265],{"class":264},[122,1583,340],{"class":339},[122,1585,343],{"class":128},[122,1587,346],{"class":271},[122,1589,349],{"class":128},[122,1591,794],{"class":271},[122,1593,355],{"class":128},[122,1595,1596],{"class":124,"line":169},[122,1597,361],{"emptyLinePlaceholder":360},[122,1599,1600,1602,1604,1606,1608,1610],{"class":124,"line":175},[122,1601,810],{"class":128},[122,1603,265],{"class":264},[122,1605,815],{"class":339},[122,1607,343],{"class":128},[122,1609,820],{"class":271},[122,1611,355],{"class":128},[122,1613,1614,1616,1618,1620,1622,1624,1626,1628,1630],{"class":124,"line":183},[122,1615,389],{"class":128},[122,1617,265],{"class":264},[122,1619,394],{"class":128},[122,1621,397],{"class":339},[122,1623,840],{"class":128},[122,1625,843],{"class":339},[122,1627,846],{"class":128},[122,1629,849],{"class":271},[122,1631,355],{"class":128},[122,1633,1634],{"class":124,"line":200},[122,1635,361],{"emptyLinePlaceholder":360},[122,1637,1638],{"class":124,"line":206},[122,1639,1640],{"class":328
},"    // Get an array of links\n",[122,1642,1643,1646,1648,1650,1652,1654,1657],{"class":124,"line":408},[122,1644,1645],{"class":128},"    links ",[122,1647,265],{"class":264},[122,1649,416],{"class":128},[122,1651,1319],{"class":339},[122,1653,343],{"class":128},[122,1655,1656],{"class":271},"\"a[href]\"",[122,1658,355],{"class":128},[122,1660,1661,1664],{"class":124,"line":425},[122,1662,1663],{"class":264},"    for",[122,1665,1666],{"class":128}," (link in links) {\n",[122,1668,1669,1672,1674],{"class":124,"line":445},[122,1670,1671],{"class":339},"        writeOutput",[122,1673,343],{"class":128},[122,1675,458],{"class":271},[122,1677,1678,1681,1684],{"class":124,"line":450},[122,1679,1680],{"class":271},"            \u003Cdiv>Link: #link.attr(\"",[122,1682,1683],{"class":128},"href",[122,1685,1686],{"class":271},"\")#\u003C/div>\n",[122,1688,1689],{"class":124,"line":461},[122,1690,1691],{"class":271},"            \u003Cdiv>Text: #link.text()#\u003C/div>\n",[122,1693,1694,1697],{"class":124,"line":467},[122,1695,1696],{"class":271},"        \"",[122,1698,355],{"class":128},[122,1700,1701],{"class":124,"line":473},[122,1702,1703],{"class":128},"    }\n",[122,1705,1706,1708,1710],{"class":124,"line":481},[122,1707,484],{"class":264},[122,1709,320],{"class":128},[122,1711,323],{"class":264},[10,1713,1714,1715,1717,1718,1720,1721,1724,1725,245],{},"In this example, we see how to get the ",[85,1716,1683],{}," value using the ",[85,1719,1420],{}," method selector and also how to obtain the text within the actual ",[85,1722,1723],{},"\u003Ca>"," element by using ",[85,1726,1727],{},"text()",[72,1729,1731],{"id":1730},"getting-form-input-data","Getting Form Input Data",[10,1733,1734,1735,1738,1739,1742],{},"Once we find the ",[85,1736,1737],{},"\u003Cform>"," element in the document, we can use selectors to iterate and grab ",[85,1740,1741],{},"\u003Cinput>"," data.",[10,1744,1745],{},"Consider this 
HTML...",[102,1747,1749],{"className":507,"code":1748,"language":509,"meta":110,"style":110},"\u003C!doctype html>\n\u003Chtml>\n  \u003Chead>\n    \u003Ctitle>CFML & jsoup Example\u003C/title>\n  \u003C/head>\n  \u003Cbody>\n    \u003Cform id=\"contact\" name=\"contact\" action=\"/\">\n      \u003Clabel>Name:\u003C/label>\n      \u003Cinput name=\"fullname\" value=\"Tony Junkes\" />\n      \u003Clabel>E-Mail:\u003C/label>\n      \u003Cinput name=\"email\" value=\"fake@email.com\" />\n      \u003Clabel>Message:\u003C/label>\n      \u003Ctextarea name=\"message\">Message here...\u003C/textarea>\n    \u003C/form>\n  \u003C/body>\n\u003C/html>\n",[85,1750,1751,1761,1769,1777,1789,1797,1805,1835,1850,1874,1887,1909,1922,1943,1952,1960],{"__ignoreMap":110},[122,1752,1753,1755,1757,1759],{"class":124,"line":125},[122,1754,516],{"class":128},[122,1756,520],{"class":519},[122,1758,523],{"class":339},[122,1760,323],{"class":128},[122,1762,1763,1765,1767],{"class":124,"line":132},[122,1764,317],{"class":128},[122,1766,509],{"class":519},[122,1768,323],{"class":128},[122,1770,1771,1773,1775],{"class":124,"line":149},[122,1772,538],{"class":128},[122,1774,541],{"class":519},[122,1776,323],{"class":128},[122,1778,1779,1781,1783,1785,1787],{"class":124,"line":158},[122,1780,548],{"class":128},[122,1782,419],{"class":519},[122,1784,553],{"class":128},[122,1786,419],{"class":519},[122,1788,323],{"class":128},[122,1790,1791,1793,1795],{"class":124,"line":169},[122,1792,648],{"class":128},[122,1794,541],{"class":519},[122,1796,323],{"class":128},[122,1798,1799,1801,1803],{"class":124,"line":175},[122,1800,538],{"class":128},[122,1802,435],{"class":519},[122,1804,323],{"class":128},[122,1806,1807,1809,1812,1814,1816,1819,1821,1823,1825,1828,1830,1833],{"class":124,"line":183},[122,1808,548],{"class":128},[122,1810,1811],{"class":519},"form",[122,1813,670],{"class":339},[122,1815,265],{"class":128},[122,1817,1818],{"class":271},"\"contact\"",[122,1820,584],{"class":339},[122,1822,265],
{"class":128},[122,1824,1818],{"class":271},[122,1826,1827],{"class":339}," action",[122,1829,265],{"class":128},[122,1831,1832],{"class":271},"\"/\"",[122,1834,323],{"class":128},[122,1836,1837,1840,1843,1846,1848],{"class":124,"line":200},[122,1838,1839],{"class":128},"      \u003C",[122,1841,1842],{"class":519},"label",[122,1844,1845],{"class":128},">Name:\u003C/",[122,1847,1842],{"class":519},[122,1849,323],{"class":128},[122,1851,1852,1854,1857,1859,1861,1864,1867,1869,1872],{"class":124,"line":206},[122,1853,1839],{"class":128},[122,1855,1856],{"class":519},"input",[122,1858,584],{"class":339},[122,1860,265],{"class":128},[122,1862,1863],{"class":271},"\"fullname\"",[122,1865,1866],{"class":339}," value",[122,1868,265],{"class":128},[122,1870,1871],{"class":271},"\"Tony Junkes\"",[122,1873,575],{"class":128},[122,1875,1876,1878,1880,1883,1885],{"class":124,"line":408},[122,1877,1839],{"class":128},[122,1879,1842],{"class":519},[122,1881,1882],{"class":128},">E-Mail:\u003C/",[122,1884,1842],{"class":519},[122,1886,323],{"class":128},[122,1888,1889,1891,1893,1895,1897,1900,1902,1904,1907],{"class":124,"line":425},[122,1890,1839],{"class":128},[122,1892,1856],{"class":519},[122,1894,584],{"class":339},[122,1896,265],{"class":128},[122,1898,1899],{"class":271},"\"email\"",[122,1901,1866],{"class":339},[122,1903,265],{"class":128},[122,1905,1906],{"class":271},"\"fake@email.com\"",[122,1908,575],{"class":128},[122,1910,1911,1913,1915,1918,1920],{"class":124,"line":445},[122,1912,1839],{"class":128},[122,1914,1842],{"class":519},[122,1916,1917],{"class":128},">Message:\u003C/",[122,1919,1842],{"class":519},[122,1921,323],{"class":128},[122,1923,1924,1926,1929,1931,1933,1936,1939,1941],{"class":124,"line":450},[122,1925,1839],{"class":128},[122,1927,1928],{"class":519},"textarea",[122,1930,584],{"class":339},[122,1932,265],{"class":128},[122,1934,1935],{"class":271},"\"message\"",[122,1937,1938],{"class":128},">Message 
here...\u003C/",[122,1940,1928],{"class":519},[122,1942,323],{"class":128},[122,1944,1945,1948,1950],{"class":124,"line":461},[122,1946,1947],{"class":128},"    \u003C/",[122,1949,1811],{"class":519},[122,1951,323],{"class":128},[122,1953,1954,1956,1958],{"class":124,"line":467},[122,1955,648],{"class":128},[122,1957,435],{"class":519},[122,1959,323],{"class":128},[122,1961,1962,1964,1966],{"class":124,"line":473},[122,1963,484],{"class":128},[122,1965,509],{"class":519},[122,1967,323],{"class":128},[10,1969,736],{},[102,1971,1973],{"className":308,"code":1972,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create object, pass in file and parse\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    JFile = createObject(\"java\", \"java.io.File\");\n\n    fileName = expandPath(\"./path/to/file.html\");\n    document = Jsoup.parse(JFile.init(fileName), \"utf-8\");\n\n    // Get the form and inputs\n    contactForm = document.getElementById(\"contact\");\n    inputs = contactForm.getElementsByTag(\"input\");\n\n    // Iterate through the inputs\n    for (input in inputs) {\n        key = input.attr(\"name\");\n        value = input.attr(\"value\");\n        writeOutput(\"\n            \u003Cdiv>Name: #key#\u003C/div>\n            \u003Cdiv>Value: #value#\u003C/div>\n        \");\n    
}\n\u003C/cfscript>\n",[85,1974,1975,1983,1987,2005,2023,2027,2041,2061,2065,2070,2087,2107,2111,2116,2123,2142,2160,2168,2174,2180,2187,2192],{"__ignoreMap":110},[122,1976,1977,1979,1981],{"class":124,"line":125},[122,1978,317],{"class":264},[122,1980,320],{"class":128},[122,1982,323],{"class":264},[122,1984,1985],{"class":124,"line":132},[122,1986,1054],{"class":328},[122,1988,1989,1991,1993,1995,1997,1999,2001,2003],{"class":124,"line":149},[122,1990,334],{"class":128},[122,1992,265],{"class":264},[122,1994,340],{"class":339},[122,1996,343],{"class":128},[122,1998,346],{"class":271},[122,2000,349],{"class":128},[122,2002,352],{"class":271},[122,2004,355],{"class":128},[122,2006,2007,2009,2011,2013,2015,2017,2019,2021],{"class":124,"line":158},[122,2008,781],{"class":128},[122,2010,265],{"class":264},[122,2012,340],{"class":339},[122,2014,343],{"class":128},[122,2016,346],{"class":271},[122,2018,349],{"class":128},[122,2020,794],{"class":271},[122,2022,355],{"class":128},[122,2024,2025],{"class":124,"line":169},[122,2026,361],{"emptyLinePlaceholder":360},[122,2028,2029,2031,2033,2035,2037,2039],{"class":124,"line":175},[122,2030,810],{"class":128},[122,2032,265],{"class":264},[122,2034,815],{"class":339},[122,2036,343],{"class":128},[122,2038,820],{"class":271},[122,2040,355],{"class":128},[122,2042,2043,2045,2047,2049,2051,2053,2055,2057,2059],{"class":124,"line":183},[122,2044,389],{"class":128},[122,2046,265],{"class":264},[122,2048,394],{"class":128},[122,2050,397],{"class":339},[122,2052,840],{"class":128},[122,2054,843],{"class":339},[122,2056,846],{"class":128},[122,2058,849],{"class":271},[122,2060,355],{"class":128},[122,2062,2063],{"class":124,"line":200},[122,2064,361],{"emptyLinePlaceholder":360},[122,2066,2067],{"class":124,"line":206},[122,2068,2069],{"class":328},"    // Get the form and inputs\n",[122,2071,2072,2075,2077,2079,2081,2083,2085],{"class":124,"line":408},[122,2073,2074],{"class":128},"    contactForm 
",[122,2076,265],{"class":264},[122,2078,416],{"class":128},[122,2080,863],{"class":339},[122,2082,343],{"class":128},[122,2084,1818],{"class":271},[122,2086,355],{"class":128},[122,2088,2089,2092,2094,2097,2100,2102,2105],{"class":124,"line":425},[122,2090,2091],{"class":128},"    inputs ",[122,2093,265],{"class":264},[122,2095,2096],{"class":128}," contactForm.",[122,2098,2099],{"class":339},"getElementsByTag",[122,2101,343],{"class":128},[122,2103,2104],{"class":271},"\"input\"",[122,2106,355],{"class":128},[122,2108,2109],{"class":124,"line":445},[122,2110,361],{"emptyLinePlaceholder":360},[122,2112,2113],{"class":124,"line":450},[122,2114,2115],{"class":328},"    // Iterate through the inputs\n",[122,2117,2118,2120],{"class":124,"line":461},[122,2119,1663],{"class":264},[122,2121,2122],{"class":128}," (input in inputs) {\n",[122,2124,2125,2128,2130,2133,2135,2137,2140],{"class":124,"line":467},[122,2126,2127],{"class":128},"        key ",[122,2129,265],{"class":264},[122,2131,2132],{"class":128}," input.",[122,2134,1335],{"class":339},[122,2136,343],{"class":128},[122,2138,2139],{"class":271},"\"name\"",[122,2141,355],{"class":128},[122,2143,2144,2147,2149,2151,2153,2155,2158],{"class":124,"line":473},[122,2145,2146],{"class":128},"        value ",[122,2148,265],{"class":264},[122,2150,2132],{"class":128},[122,2152,1335],{"class":339},[122,2154,343],{"class":128},[122,2156,2157],{"class":271},"\"value\"",[122,2159,355],{"class":128},[122,2161,2162,2164,2166],{"class":124,"line":481},[122,2163,1671],{"class":339},[122,2165,343],{"class":128},[122,2167,458],{"class":271},[122,2169,2171],{"class":124,"line":2170},18,[122,2172,2173],{"class":271},"            \u003Cdiv>Name: #key#\u003C/div>\n",[122,2175,2177],{"class":124,"line":2176},19,[122,2178,2179],{"class":271},"            \u003Cdiv>Value: 
#value#\u003C/div>\n",[122,2181,2183,2185],{"class":124,"line":2182},20,[122,2184,1696],{"class":271},[122,2186,355],{"class":128},[122,2188,2190],{"class":124,"line":2189},21,[122,2191,1703],{"class":128},[122,2193,2195,2197,2199],{"class":124,"line":2194},22,[122,2196,484],{"class":264},[122,2198,320],{"class":128},[122,2200,323],{"class":264},[10,2202,2203,2204,2207,2208,2210,2211,2214,2215,2217],{},"So we've used ",[85,2205,2206],{},"getElementById()"," to find the ",[85,2209,1737],{}," element and then ",[85,2212,2213],{},"getElementsByTag()"," to grab all of the ",[85,2216,1741],{}," elements within the form. At this point, we can iterate through the array of inputs and use selector methods to act on the data.",[19,2219],{},[295,2221,2223],{"id":2222},"sanitizing-html-content-with-jsoup","Sanitizing HTML Content With jsoup",[10,2225,2226,2227,2232,2233,2238],{},"jsoup provides a collection of classes and methods for sanitizing HTML. Similar to Antisamy, you can use a premade or custom ",[30,2228,2231],{"href":2229,"rel":2230},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html",[34],"Whitelist class object"," that specifies valid and invalid elements in a document. 
This whitelist object is then passed to a ",[30,2234,2237],{"href":2235,"rel":2236},"https://jsoup.org/apidocs/org/jsoup/safety/Cleaner.html",[34],"Cleaner class object"," which checks the document against the whitelist rules and removes any invalid content.",[102,2240,2242],{"className":308,"code":2241,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    Whitelist = createObject(\"java\", \"org.jsoup.safety.Whitelist\");\n    Cleaner = createObject(\"java\", \"org.jsoup.safety.Cleaner\");\n\n    html = \"\u003Chtml>\u003Chead>\u003Ctitle>My title\u003C/title>\u003C/head>\u003Cbody>\u003Ccenter>Body content\u003C/center>\u003C/body>\u003C/html>\";\n    filter = Whitelist.none();\n    valid = Jsoup.isValid(html, filter);\n\n    if (valid) {\n        writeOutput(\"The document is valid!\");\n    } else {\n        invalidData = Jsoup.parse(html);\n        writeOutput(\"The document is not valid!\");\n        writeDump(invalidData.html());\n        cleanDocument = Cleaner.init(filter).clean(invalidData);\n        writeOutput(\"The document has been cleaned.\");\n        writeDump(cleanDocument.html());\n    }\n\u003C/cfscript>\n",[85,2243,2244,2252,2270,2290,2310,2314,2325,2340,2355,2359,2367,2378,2389,2402,2413,2425,2446,2457,2468,2472],{"__ignoreMap":110},[122,2245,2246,2248,2250],{"class":124,"line":125},[122,2247,317],{"class":264},[122,2249,320],{"class":128},[122,2251,323],{"class":264},[122,2253,2254,2256,2258,2260,2262,2264,2266,2268],{"class":124,"line":132},[122,2255,334],{"class":128},[122,2257,265],{"class":264},[122,2259,340],{"class":339},[122,2261,343],{"class":128},[122,2263,346],{"class":271},[122,2265,349],{"class":128},[122,2267,352],{"class":271},[122,2269,355],{"class":128},[122,2271,2272,2275,2277,2279,2281,2283,2285,2288],{"class":124,"line":149},[122,2273,2274],{"class":128},"    Whitelist 
",[122,2276,265],{"class":264},[122,2278,340],{"class":339},[122,2280,343],{"class":128},[122,2282,346],{"class":271},[122,2284,349],{"class":128},[122,2286,2287],{"class":271},"\"org.jsoup.safety.Whitelist\"",[122,2289,355],{"class":128},[122,2291,2292,2295,2297,2299,2301,2303,2305,2308],{"class":124,"line":158},[122,2293,2294],{"class":128},"    Cleaner ",[122,2296,265],{"class":264},[122,2298,340],{"class":339},[122,2300,343],{"class":128},[122,2302,346],{"class":271},[122,2304,349],{"class":128},[122,2306,2307],{"class":271},"\"org.jsoup.safety.Cleaner\"",[122,2309,355],{"class":128},[122,2311,2312],{"class":124,"line":169},[122,2313,361],{"emptyLinePlaceholder":360},[122,2315,2316,2318,2320,2323],{"class":124,"line":175},[122,2317,371],{"class":128},[122,2319,265],{"class":264},[122,2321,2322],{"class":271}," \"\u003Chtml>\u003Chead>\u003Ctitle>My title\u003C/title>\u003C/head>\u003Cbody>\u003Ccenter>Body content\u003C/center>\u003C/body>\u003C/html>\"",[122,2324,379],{"class":128},[122,2326,2327,2330,2332,2335,2338],{"class":124,"line":183},[122,2328,2329],{"class":128},"    filter ",[122,2331,265],{"class":264},[122,2333,2334],{"class":128}," Whitelist.",[122,2336,2337],{"class":339},"none",[122,2339,422],{"class":128},[122,2341,2342,2345,2347,2349,2352],{"class":124,"line":200},[122,2343,2344],{"class":128},"    valid ",[122,2346,265],{"class":264},[122,2348,394],{"class":128},[122,2350,2351],{"class":339},"isValid",[122,2353,2354],{"class":128},"(html, filter);\n",[122,2356,2357],{"class":124,"line":206},[122,2358,361],{"emptyLinePlaceholder":360},[122,2360,2361,2364],{"class":124,"line":408},[122,2362,2363],{"class":264},"    if",[122,2365,2366],{"class":128}," (valid) {\n",[122,2368,2369,2371,2373,2376],{"class":124,"line":425},[122,2370,1671],{"class":339},[122,2372,343],{"class":128},[122,2374,2375],{"class":271},"\"The document is valid!\"",[122,2377,355],{"class":128},[122,2379,2380,2383,2386],{"class":124,"line":445},[122,2381,2382],{"class":128},"  
  } ",[122,2384,2385],{"class":264},"else",[122,2387,2388],{"class":128}," {\n",[122,2390,2391,2394,2396,2398,2400],{"class":124,"line":450},[122,2392,2393],{"class":128},"        invalidData ",[122,2395,265],{"class":264},[122,2397,394],{"class":128},[122,2399,397],{"class":339},[122,2401,400],{"class":128},[122,2403,2404,2406,2408,2411],{"class":124,"line":461},[122,2405,1671],{"class":339},[122,2407,343],{"class":128},[122,2409,2410],{"class":271},"\"The document is not valid!\"",[122,2412,355],{"class":128},[122,2414,2415,2418,2421,2423],{"class":124,"line":467},[122,2416,2417],{"class":339},"        writeDump",[122,2419,2420],{"class":128},"(invalidData.",[122,2422,509],{"class":339},[122,2424,885],{"class":128},[122,2426,2427,2430,2432,2435,2437,2440,2443],{"class":124,"line":473},[122,2428,2429],{"class":128},"        cleanDocument ",[122,2431,265],{"class":264},[122,2433,2434],{"class":128}," Cleaner.",[122,2436,843],{"class":339},[122,2438,2439],{"class":128},"(filter).",[122,2441,2442],{"class":339},"clean",[122,2444,2445],{"class":128},"(invalidData);\n",[122,2447,2448,2450,2452,2455],{"class":124,"line":481},[122,2449,1671],{"class":339},[122,2451,343],{"class":128},[122,2453,2454],{"class":271},"\"The document has been cleaned.\"",[122,2456,355],{"class":128},[122,2458,2459,2461,2464,2466],{"class":124,"line":2170},[122,2460,2417],{"class":339},[122,2462,2463],{"class":128},"(cleanDocument.",[122,2465,509],{"class":339},[122,2467,885],{"class":128},[122,2469,2470],{"class":124,"line":2176},[122,2471,1703],{"class":128},[122,2473,2474,2476,2478],{"class":124,"line":2182},[122,2475,484],{"class":264},[122,2477,320],{"class":128},[122,2479,323],{"class":264},[10,2481,2482,2483,2486,2487,2490,2491,2494,2495,2498,2499,2502],{},"This example takes simple HTML content and passes it to a ",[85,2484,2485],{},"Whitelist"," that calls the ",[85,2488,2489],{},"none()"," method. 
This is a pre-defined Whitelist that allows only text nodes and rejects all HTML markup inside of the ",[85,2492,2493],"\u003Cbody>",". When the Whitelist is passed to the ",[85,2496,2497],"Cleaner",", the ",[85,2500,2501],"clean()"," method is called to strip the disallowed markup and leave only valid content.",[10,2504,2505],{},"A list of default options includes:",[44,2507,2508,2514,2521,2528,2535],{},[47,2509,2510],{},[30,2511,2489],{"href":2512,"rel":2513},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html#none--",[34],[47,2515,2516],{},[30,2517,2520],{"href":2518,"rel":2519},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html#simpleText--",[34],"simpleText()",[47,2522,2523],{},[30,2524,2527],{"href":2525,"rel":2526},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html#basic--",[34],"basic()",[47,2529,2530],{},[30,2531,2534],{"href":2532,"rel":2533},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html#basicWithImages--",[34],"basicWithImages()",[47,2536,2537],{},[30,2538,2541],{"href":2539,"rel":2540},"https://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html#relaxed--",[34],"relaxed()",[19,2543],{},[295,2545,2547],{"id":2546},"extra-credit-a-few-in-depth-techniques","Extra Credit - A Few In Depth Techniques",[10,2549,2550],{},"Here are a few more examples I thought were worth mentioning because jsoup is so cool.",[72,2552,2554],{"id":2553},"extract-replace-an-html-element-with-its-content","Extract & Replace An HTML Element With Its Content",[10,2556,2557,2558,2560,2561,245],{},"This example gets the inner content of an ",[85,2559,1723],{}," element and replaces the element with only that content, using a ",[30,2562,2565],{"href":2563,"rel":2564},"https://jsoup.org/apidocs/org/jsoup/nodes/TextNode.html",[34],"TextNode class object",[102,2567,2569],{"className":308,"code":2568,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create Java objects\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n    TextNode = createObject(\"java\", 
\"org.jsoup.nodes.TextNode\");\n\n    // Create some markup...\n    html = '\u003Chtml>\u003Chead>\u003Ctitle>Hello World!\u003C/title>\u003C/head>\u003Cbody>\u003Ch1>A Header\u003C/h1>\u003Cp>Some content. \u003Ca href=\"##\">A cool link.\u003C/a>\u003C/p>\u003C/body>\u003C/html>';\n    // Parse it into a Jsoup Document\n    document = Jsoup.parse(html);\n\n    // Create a Node object\n    link = document.select(\"a\").first();\n    node = TextNode.init(link.text(), \"\");\n    link.replaceWith(node);\n\n    writeDump(label=\"Original HTML\", var=\"#html#\");\n    writeDump(label=\"Link Text\", var=\"#link.text()#\");\n    writeDump(label=\"Modified HTML\", var=\"#document.body().toString()#\");\n    writeDump(node);\n\u003C/cfscript>\n",[85,2570,2571,2579,2584,2602,2622,2626,2631,2642,2647,2659,2663,2668,2690,2715,2726,2730,2752,2772,2792,2798],{"__ignoreMap":110},[122,2572,2573,2575,2577],{"class":124,"line":125},[122,2574,317],{"class":264},[122,2576,320],{"class":128},[122,2578,323],{"class":264},[122,2580,2581],{"class":124,"line":132},[122,2582,2583],{"class":328},"    // Create Java objects\n",[122,2585,2586,2588,2590,2592,2594,2596,2598,2600],{"class":124,"line":149},[122,2587,334],{"class":128},[122,2589,265],{"class":264},[122,2591,340],{"class":339},[122,2593,343],{"class":128},[122,2595,346],{"class":271},[122,2597,349],{"class":128},[122,2599,352],{"class":271},[122,2601,355],{"class":128},[122,2603,2604,2607,2609,2611,2613,2615,2617,2620],{"class":124,"line":158},[122,2605,2606],{"class":128},"    TextNode ",[122,2608,265],{"class":264},[122,2610,340],{"class":339},[122,2612,343],{"class":128},[122,2614,346],{"class":271},[122,2616,349],{"class":128},[122,2618,2619],{"class":271},"\"org.jsoup.nodes.TextNode\"",[122,2621,355],{"class":128},[122,2623,2624],{"class":124,"line":169},[122,2625,361],{"emptyLinePlaceholder":360},[122,2627,2628],{"class":124,"line":175},[122,2629,2630],{"class":328},"    // Create some 
markup...\n",[122,2632,2633,2635,2637,2640],{"class":124,"line":183},[122,2634,371],{"class":128},[122,2636,265],{"class":264},[122,2638,2639],{"class":271}," '\u003Chtml>\u003Chead>\u003Ctitle>Hello World!\u003C/title>\u003C/head>\u003Cbody>\u003Ch1>A Header\u003C/h1>\u003Cp>Some content. \u003Ca href=\"##\">A cool link.\u003C/a>\u003C/p>\u003C/body>\u003C/html>'",[122,2641,379],{"class":128},[122,2643,2644],{"class":124,"line":200},[122,2645,2646],{"class":328},"    // Parse it into a Jsoup Document\n",[122,2648,2649,2651,2653,2655,2657],{"class":124,"line":206},[122,2650,389],{"class":128},[122,2652,265],{"class":264},[122,2654,394],{"class":128},[122,2656,397],{"class":339},[122,2658,400],{"class":128},[122,2660,2661],{"class":124,"line":408},[122,2662,361],{"emptyLinePlaceholder":360},[122,2664,2665],{"class":124,"line":425},[122,2666,2667],{"class":328},"    // Create a Node object\n",[122,2669,2670,2673,2675,2677,2679,2681,2684,2686,2688],{"class":124,"line":445},[122,2671,2672],{"class":128},"    link ",[122,2674,265],{"class":264},[122,2676,416],{"class":128},[122,2678,1319],{"class":339},[122,2680,343],{"class":128},[122,2682,2683],{"class":271},"\"a\"",[122,2685,1327],{"class":128},[122,2687,1330],{"class":339},[122,2689,422],{"class":128},[122,2691,2692,2695,2697,2700,2702,2705,2707,2710,2713],{"class":124,"line":450},[122,2693,2694],{"class":128},"    node ",[122,2696,265],{"class":264},[122,2698,2699],{"class":128}," TextNode.",[122,2701,843],{"class":339},[122,2703,2704],{"class":128},"(link.",[122,2706,107],{"class":339},[122,2708,2709],{"class":128},"(), ",[122,2711,2712],{"class":271},"\"\"",[122,2714,355],{"class":128},[122,2716,2717,2720,2723],{"class":124,"line":461},[122,2718,2719],{"class":128},"    
link.",[122,2721,2722],{"class":339},"replaceWith",[122,2724,2725],{"class":128},"(node);\n",[122,2727,2728],{"class":124,"line":467},[122,2729,361],{"emptyLinePlaceholder":360},[122,2731,2732,2734,2737,2739,2742,2745,2747,2750],{"class":124,"line":473},[122,2733,1005],{"class":339},[122,2735,2736],{"class":128},"(label",[122,2738,265],{"class":264},[122,2740,2741],{"class":271},"\"Original HTML\"",[122,2743,2744],{"class":128},", var",[122,2746,265],{"class":264},[122,2748,2749],{"class":271},"\"#html#\"",[122,2751,355],{"class":128},[122,2753,2754,2756,2758,2760,2763,2765,2767,2770],{"class":124,"line":481},[122,2755,1005],{"class":339},[122,2757,2736],{"class":128},[122,2759,265],{"class":264},[122,2761,2762],{"class":271},"\"Link Text\"",[122,2764,2744],{"class":128},[122,2766,265],{"class":264},[122,2768,2769],{"class":271},"\"#link.text()#\"",[122,2771,355],{"class":128},[122,2773,2774,2776,2778,2780,2783,2785,2787,2790],{"class":124,"line":2170},[122,2775,1005],{"class":339},[122,2777,2736],{"class":128},[122,2779,265],{"class":264},[122,2781,2782],{"class":271},"\"Modified HTML\"",[122,2784,2744],{"class":128},[122,2786,265],{"class":264},[122,2788,2789],{"class":271},"\"#document.body().toString()#\"",[122,2791,355],{"class":128},[122,2793,2794,2796],{"class":124,"line":2176},[122,2795,1005],{"class":339},[122,2797,2725],{"class":128},[122,2799,2800,2802,2804],{"class":124,"line":2182},[122,2801,484],{"class":264},[122,2803,320],{"class":128},[122,2805,323],{"class":264},[10,2807,2808,2809,2812,2813,2815,2816,2819],{},"Using a ",[85,2810,2811],{},"TextNode",", we can store the content between the ",[85,2814,1723],{}," element. 
Then call the ",[85,2817,2818],{},"replaceWith()"," method on the element to switch out the HTML for plain text.",[72,2821,2823],{"id":2822},"filtering-selected-results-with-regex","Filtering Selected Results With Regex",[10,2825,2826,2827,2829,2830,2833,2834,245],{},"Using the same ",[85,2828,1210],{}," method, we can pass in a regular expression string to filter results by using ",[85,2831,2832],{},"~="," instead of ",[85,2835,265],{},[102,2837,2839],{"className":308,"code":2838,"language":310,"meta":110,"style":110},"\u003Ccfscript>\n    // Create Java objects\n    Jsoup = createObject(\"java\", \"org.jsoup.Jsoup\");\n\n    siteAddress = \"https://jsoup.org/\";\n    document = Jsoup.connect(siteAddress).get();\n    links = document.select(\"a[href~=^((?!##|html).)*$]\");\n\n    original = [];\n    for (link in document.select(\"a[href]\")) {\n        original.append(link.attr(\"href\"));\n    }\n\n    filtered = [];\n    for (link in links) {\n        filtered.append(link.attr(\"href\"));\n    }\n\n    // Original links\n    writeDump(label=\"Original Links\", var=\"#original#\");\n    // Filtered links\n    writeDump(label=\"Filtered Links\", 
var=\"#filtered#\");\n\u003C/cfscript>\n",[85,2840,2841,2849,2853,2871,2875,2885,2901,2918,2922,2932,2948,2968,2972,2976,2985,2991,3008,3012,3016,3021,3041,3046,3066],{"__ignoreMap":110},[122,2842,2843,2845,2847],{"class":124,"line":125},[122,2844,317],{"class":264},[122,2846,320],{"class":128},[122,2848,323],{"class":264},[122,2850,2851],{"class":124,"line":132},[122,2852,2583],{"class":328},[122,2854,2855,2857,2859,2861,2863,2865,2867,2869],{"class":124,"line":149},[122,2856,334],{"class":128},[122,2858,265],{"class":264},[122,2860,340],{"class":339},[122,2862,343],{"class":128},[122,2864,346],{"class":271},[122,2866,349],{"class":128},[122,2868,352],{"class":271},[122,2870,355],{"class":128},[122,2872,2873],{"class":124,"line":158},[122,2874,361],{"emptyLinePlaceholder":360},[122,2876,2877,2879,2881,2883],{"class":124,"line":169},[122,2878,956],{"class":128},[122,2880,265],{"class":264},[122,2882,961],{"class":271},[122,2884,379],{"class":128},[122,2886,2887,2889,2891,2893,2895,2897,2899],{"class":124,"line":175},[122,2888,389],{"class":128},[122,2890,265],{"class":264},[122,2892,394],{"class":128},[122,2894,974],{"class":339},[122,2896,977],{"class":128},[122,2898,980],{"class":339},[122,2900,422],{"class":128},[122,2902,2903,2905,2907,2909,2911,2913,2916],{"class":124,"line":183},[122,2904,1645],{"class":128},[122,2906,265],{"class":264},[122,2908,416],{"class":128},[122,2910,1319],{"class":339},[122,2912,343],{"class":128},[122,2914,2915],{"class":271},"\"a[href~=^((?!##|html).)*$]\"",[122,2917,355],{"class":128},[122,2919,2920],{"class":124,"line":200},[122,2921,361],{"emptyLinePlaceholder":360},[122,2923,2924,2927,2929],{"class":124,"line":206},[122,2925,2926],{"class":128},"    original ",[122,2928,265],{"class":264},[122,2930,2931],{"class":128}," [];\n",[122,2933,2934,2936,2939,2941,2943,2945],{"class":124,"line":408},[122,2935,1663],{"class":264},[122,2937,2938],{"class":128}," (link in 
document.",[122,2940,1319],{"class":339},[122,2942,343],{"class":128},[122,2944,1656],{"class":271},[122,2946,2947],{"class":128},")) {\n",[122,2949,2950,2953,2956,2958,2960,2962,2965],{"class":124,"line":425},[122,2951,2952],{"class":128},"        original.",[122,2954,2955],{"class":339},"append",[122,2957,2704],{"class":128},[122,2959,1335],{"class":339},[122,2961,343],{"class":128},[122,2963,2964],{"class":271},"\"href\"",[122,2966,2967],{"class":128},"));\n",[122,2969,2970],{"class":124,"line":445},[122,2971,1703],{"class":128},[122,2973,2974],{"class":124,"line":450},[122,2975,361],{"emptyLinePlaceholder":360},[122,2977,2978,2981,2983],{"class":124,"line":461},[122,2979,2980],{"class":128},"    filtered ",[122,2982,265],{"class":264},[122,2984,2931],{"class":128},[122,2986,2987,2989],{"class":124,"line":467},[122,2988,1663],{"class":264},[122,2990,1666],{"class":128},[122,2992,2993,2996,2998,3000,3002,3004,3006],{"class":124,"line":473},[122,2994,2995],{"class":128},"        filtered.",[122,2997,2955],{"class":339},[122,2999,2704],{"class":128},[122,3001,1335],{"class":339},[122,3003,343],{"class":128},[122,3005,2964],{"class":271},[122,3007,2967],{"class":128},[122,3009,3010],{"class":124,"line":481},[122,3011,1703],{"class":128},[122,3013,3014],{"class":124,"line":2170},[122,3015,361],{"emptyLinePlaceholder":360},[122,3017,3018],{"class":124,"line":2176},[122,3019,3020],{"class":328},"    // Original links\n",[122,3022,3023,3025,3027,3029,3032,3034,3036,3039],{"class":124,"line":2182},[122,3024,1005],{"class":339},[122,3026,2736],{"class":128},[122,3028,265],{"class":264},[122,3030,3031],{"class":271},"\"Original Links\"",[122,3033,2744],{"class":128},[122,3035,265],{"class":264},[122,3037,3038],{"class":271},"\"#original#\"",[122,3040,355],{"class":128},[122,3042,3043],{"class":124,"line":2189},[122,3044,3045],{"class":328},"    // Filtered 
links\n",[122,3047,3048,3050,3052,3054,3057,3059,3061,3064],{"class":124,"line":2194},[122,3049,1005],{"class":339},[122,3051,2736],{"class":128},[122,3053,265],{"class":264},[122,3055,3056],{"class":271},"\"Filtered Links\"",[122,3058,2744],{"class":128},[122,3060,265],{"class":264},[122,3062,3063],{"class":271},"\"#filtered#\"",[122,3065,355],{"class":128},[122,3067,3069,3071,3073],{"class":124,"line":3068},23,[122,3070,484],{"class":264},[122,3072,320],{"class":128},[122,3074,323],{"class":264},[10,3076,3077,3078,3081,3082,245],{},"In this example, we grab links from the home page of the jsoup site. The filtered links use a regex to exclude any URLs that contain a ",[85,3079,3080],{},"#"," or the string ",[85,3083,509],{},[14,3085,3087],{"id":3086},"wrapping-up","Wrapping Up",[19,3089],{},[10,3091,3092,3094],{},[85,3093,35],{}," is a powerful library for parsing and manipulating HTML. The possibilities are endless when working with node-structured documents.",[10,3096,3097,3098,245],{},"For more info on its classes and methods, check out the ",[30,3099,3102],{"href":3100,"rel":3101},"https://jsoup.org/apidocs/",[34],"jsoup API Docs",[10,3104,3105,3106,245],{},"To help you dig into these examples, I've put together a little project that can be run from CommandBox. 
You can find it at ",[30,3107,3110],{"href":3108,"rel":3109},"https://github.com/tonyjunkes/cfml-jsoup-example",[34],"GitHub - cfml-jsoup-example",[10,3112,3113],{},"Cheers!",[3115,3116,3117],"style",{},"html pre.shiki code .s__4a, html code.shiki .s__4a{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8;--shiki-sepia:#F8F8F2}html pre.shiki code .sHysA, html code.shiki .sHysA{--shiki-light:#005CC5;--shiki-light-font-style:inherit;--shiki-default:#79B8FF;--shiki-default-font-style:inherit;--shiki-dark:#79B8FF;--shiki-dark-font-style:inherit;--shiki-sepia:#66D9EF;--shiki-sepia-font-style:italic}html pre.shiki code .sSnWl, html code.shiki .sSnWl{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF;--shiki-sepia:#CFCFC2}html pre.shiki code .s-7EV, html code.shiki .s-7EV{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF;--shiki-sepia:#AE81FF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: 
var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html .sepia .shiki span {color: var(--shiki-sepia);background: var(--shiki-sepia-bg);font-style: var(--shiki-sepia-font-style);font-weight: var(--shiki-sepia-font-weight);text-decoration: var(--shiki-sepia-text-decoration);}html.sepia .shiki span {color: var(--shiki-sepia);background: var(--shiki-sepia-bg);font-style: var(--shiki-sepia-font-style);font-weight: var(--shiki-sepia-font-weight);text-decoration: var(--shiki-sepia-text-decoration);}html pre.shiki code .s0XdW, html code.shiki .s0XdW{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF;--shiki-sepia:#FD971F}html pre.shiki code .sOw68, html code.shiki .sOw68{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583;--shiki-sepia:#F92672}html pre.shiki code .svw4x, html code.shiki .svw4x{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF;--shiki-sepia:#E6DB74}html pre.shiki code .sq-8i, html code.shiki .sq-8i{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D;--shiki-sepia:#88846F}html pre.shiki code .s9ihy, html code.shiki .s9ihy{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0;--shiki-sepia:#A6E22E}html pre.shiki code .sIn_X, html code.shiki 
.sIn_X{--shiki-light:#22863A;--shiki-default:#85E89D;--shiki-dark:#85E89D;--shiki-sepia:#F92672}",{"title":110,"searchDepth":132,"depth":132,"links":3119},[3120,3121,3122,3128],{"id":16,"depth":132,"text":17},{"id":64,"depth":132,"text":65},{"id":290,"depth":132,"text":291,"children":3123},[3124,3125,3126,3127],{"id":297,"depth":149,"text":298},{"id":1028,"depth":149,"text":1029},{"id":2222,"depth":149,"text":2223},{"id":2546,"depth":149,"text":2547},{"id":3086,"depth":132,"text":3087},"2017-08-18","md",{"aliases":3132},[3133,3134],"/blog/crah-course-into-cfml-and-jsoup/","/blog/crash-course-into-cfml-and-jsoup/","/blog/2017-08-18-crash-course-in-cfml-and-jsoup",{"title":5,"description":12},"crash-course-in-cfml-and-jsoup","blog/2017-08-18-crash-course-in-cfml-and-jsoup",[3140,35,3141],"CFML","Java","rNc4hGYm9bjJWSY1aF6gT88BUnQDAh7ODDsbRk2BEQ4"]