.NET WebClient DownloadString screwed up my Unicode non-English characters

For the past several days, I've been trying to build a small utility to copy a Tumblr blog within the same account. Some of my posts contain Unicode characters, and instead of getting مركز ميركاتو I was getting Ù’¦Ã˜±Ã™Æ’ز Ù’¦Ã™Å Ã˜±Ã™Æ’اتÙˆ. Probably not much difference to you and me, but for Arabic readers the two are very different :)

Originally I had something like the following:

string data = client.DownloadString("[some url]");
var reader = new StringReader(data);

Pretty straightforward, right? But it doesn't work. I found out the hard way that client.DownloadString doesn't decode the downloaded bytes as UTF-8 by default.

To fix it, I changed the code to download the raw bytes and decode them as UTF-8 myself:

var data = client.DownloadData("[some url]");
var strungData = Encoding.UTF8.GetString(data);
var reader = new StringReader(strungData);
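
To see why the wrong decoder produces that kind of garbage, here is a minimal sketch that needs no network access. It encodes the Arabic text as UTF-8 and then decodes the same bytes two ways. The class name MojibakeDemo and the helper Decode are made up for illustration, and ISO-8859-1 is only an assumption standing in for whatever non-UTF-8 encoding the client actually used.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Hypothetical helper: decode raw bytes with the given encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        string original = "مركز ميركاتو";

        // Roughly what travels over the wire: the text as UTF-8 bytes.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(original);

        // Assumption: ISO-8859-1 plays the role of the wrong, single-byte default.
        // Each byte becomes its own character, so multi-byte letters fall apart.
        string garbled = Decode(utf8Bytes, Encoding.GetEncoding("ISO-8859-1"));

        // Decoding the same bytes as UTF-8 restores the text.
        string restored = Decode(utf8Bytes, Encoding.UTF8);

        Console.WriteLine("Original chars: " + original.Length);
        Console.WriteLine("UTF-8 bytes:    " + utf8Bytes.Length);
        Console.WriteLine("Garbled chars:  " + garbled.Length);
        Console.WriteLine("Round trip OK:  " + (restored == original));
    }
}
```

The garbled string has one character per byte, which is why the broken output is roughly twice as long as the Arabic original. The bytes themselves are fine; only the choice of decoder differs.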

  1. Antonio says:

    Here is the easy way:

    WebClient client = new WebClient();
    client.Encoding = Encoding.UTF8;
    string data = client.DownloadString("[some url]");