java - Why does Transformer return &lt; and &gt; instead of < and >?
transformer.transform(domSource, streamResult);
The input in the DOMSource contains many <br> tags, but in the output I get &lt; instead of < and &gt; instead of >, so <br> is returned as &lt;br&gt;. I know that &lt; and &gt; are equivalent to < and >. How can I make the Transformer class change the encoding and return <br> instead?
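XML creator
import java.io.File;
import java.io.StringWriter;
import java.util.Scanner;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CreatXML {
    public static void main(String[] args) {
        try {
            // Read the article and join its lines with a literal "<br>" string
            File article = new File("article.txt");
            Scanner scan = new Scanner(article);
            StringBuilder str = new StringBuilder();
            while (scan.hasNext()) {
                str.append(scan.nextLine());
                str.append("<br>");
            }

            // Build <div class="code"><p>...</p></div>
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.newDocument();
            Element body = doc.createElement("div");
            doc.appendChild(body);
            Attr classAttr = doc.createAttribute("class");
            classAttr.setValue("code");
            body.setAttributeNode(classAttr);
            Element p = doc.createElement("p");
            p.appendChild(doc.createTextNode(str.toString()));
            body.appendChild(p);

            // Serialize the DOM to a string and print it
            TransformerFactory transFatory = TransformerFactory.newInstance();
            Transformer transformer = transFatory.newTransformer();
            DOMSource dom = new DOMSource(doc);
            StringWriter writer = new StringWriter();
            StreamResult result = new StreamResult(writer);
            transformer.transform(dom, result);
            System.out.println(writer.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}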
Input sample
<br>
this input sample<br>
Output
<?xml [stuff] ><div><p>&lt;br&gt;this input sample&lt;br&gt;&lt;br&gt;</p></div>
The problem lies here:
p.appendChild(doc.createTextNode(str.toString()));
You don't have any <br> elements in your document. You have a single <p> element whose textual content contains occurrences of the four characters <, b, r, and >. To keep the output well-formed XML, those characters are being encoded in the manner you're seeing.
In other words, createTextNode does not create XML elements.
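To see this in isolation, here is a minimal sketch (not the asker's program) that puts the string "line one<br>" into a text node and serializes it. The class name TextNodeDemo and the helper serializeAsText are made up for illustration, and the XML declaration is omitted only to keep the output short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TextNodeDemo {
    // Hypothetical helper: builds <p> with a single text-node child and serializes it.
    static String serializeAsText(String content) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element p = doc.createElement("p");
        // The whole string, markup characters included, becomes character data.
        p.appendChild(doc.createTextNode(content));
        doc.appendChild(p);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // The "<" of "<br>" is escaped in the output, as in the question.
        System.out.println(serializeAsText("line one<br>"));
    }
}
```

With the JDK's built-in transformer this should print the <br> in escaped form, matching the output shown in the question.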
Instead of using a StringBuilder, you'll need to create separate text nodes and element nodes:
while (scan.hasNext()) {
    p.appendChild(doc.createTextNode(scan.nextLine()));
    p.appendChild(doc.createElement("br"));
}
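Put together, the fix might look like the following sketch. It assumes the lines are passed in as a list rather than read from article.txt, so it runs without any input file. The class name BrElementsDemo and the method toXml are hypothetical, and setAttribute is used as a shorthand for the createAttribute/setAttributeNode pair in the question.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class BrElementsDemo {
    // Hypothetical helper: builds <div class="code"><p>...</p></div> with one
    // text node and one real <br> element per input line, then serializes it.
    static String toXml(List<String> lines) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element body = doc.createElement("div");
        body.setAttribute("class", "code");
        doc.appendChild(body);

        Element p = doc.createElement("p");
        body.appendChild(p);
        for (String line : lines) {
            p.appendChild(doc.createTextNode(line));  // character data
            p.appendChild(doc.createElement("br"));   // an actual element node
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(Arrays.asList("first line", "second line")));
    }
}
```

Because the default output method is XML, the empty element is typically written as <br/> rather than <br>. If HTML-style output is wanted, setting OutputKeys.METHOD to "html" on the Transformer is the standard way to ask for it.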