Sunday, June 3, 2007

MSHTML, DOM, and copying objects from a WebBrowser Control

While working on grabbing an image from a WebBrowser control (talked about in my last post), I stumbled upon this method. It does work, but it has one drawback: it only works for single-frame images (i.e., no animated images). To be honest, though, I didn't play around enough to see whether I could finagle it into working with animated images.

I found this method on a foreign site, and half the code was gone, so I had to drum up the rest by sorting through the MSDN help files. I'm not sure whether anyone else has looked up DOM-related material there, but as far as I can tell it's all quite dated, and really useful information is sparse and hard to come by. Luckily, with time I was able to piece together enough to work out the appropriate casts.

Alright, you have a WebBrowser control, and there's a particular element on the page that you want. If it's text, I'd suggest just parsing the control's DocumentText property. If it's an image, this is a workable solution if you don't feel comfortable with the one in the previous post (which covers a somewhat simpler method that also handles sessions).

Before posting the code (straight from my project), I would like to point out that I didn't end up using this method; it was mostly a test. I still think it might prove useful to some people, so I'm posting it. I'd still suggest looking at the previous post, which covers a somewhat easier method, before using this one.

Code:
// Requires a project reference to the Microsoft HTML Object Library (mshtml).
mshtml.IHTMLDocument2 doc =
    (mshtml.IHTMLDocument2)webBrowser1.Document.DomDocument;
mshtml.IHTMLSelectionObject sobj = doc.selection;
mshtml.HTMLBody body = doc.body as mshtml.HTMLBody;

// Clear any existing selection, then build a control range holding the image.
sobj.empty();
mshtml.IHTMLControlRange range =
    body.createControlRange() as mshtml.IHTMLControlRange;
mshtml.IHTMLControlElement img =
    (mshtml.IHTMLControlElement)webBrowser1.Document.Images[0].DomElement;

range.add(img);
range.select();

// Copy the selected image to the clipboard.
range.execCommand("Copy", false, null);

// Pull the image back off the clipboard and display it.
Bitmap bimg = new Bitmap(Clipboard.GetImage());
pictureBox1.Image = bimg;
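
One thing the snippet above takes for granted is that a page has finished loading and actually contains an image; if not, webBrowser1.Document is null or Images[0] is out of range. Here is a minimal sketch of the same approach with those checks added, assuming a WinForms project that references the Microsoft HTML Object Library. The BrowserImageCopier class and TryCopyFirstImage method are names made up for illustration; the mshtml calls are the same ones used above.

```csharp
using System.Drawing;
using System.Windows.Forms;

// Hypothetical helper (not part of the original project): wraps the clipboard
// trick above in the checks it silently depends on.
static class BrowserImageCopier
{
    // Returns the first image on the page as a Bitmap, or null if no page is
    // loaded, the page has no images, or no image data reached the clipboard.
    // Call this on the UI (STA) thread, like any other clipboard access.
    public static Bitmap TryCopyFirstImage(WebBrowser browser)
    {
        // Document is null until a page has been loaded into the control.
        if (browser.Document == null || browser.Document.Images.Count == 0)
            return null;

        mshtml.IHTMLDocument2 doc =
            (mshtml.IHTMLDocument2)browser.Document.DomDocument;

        // The cast yields null when the page has no ordinary <body> element.
        mshtml.HTMLBody body = doc.body as mshtml.HTMLBody;
        if (body == null)
            return null;

        // Same steps as above: clear the selection, select the image, copy it.
        doc.selection.empty();
        mshtml.IHTMLControlRange range =
            (mshtml.IHTMLControlRange)body.createControlRange();
        range.add((mshtml.IHTMLControlElement)browser.Document.Images[0].DomElement);
        range.select();
        range.execCommand("Copy", false, null);

        // GetImage returns null when the clipboard holds no image data.
        Image copied = Clipboard.GetImage();
        return copied == null ? null : new Bitmap(copied);
    }
}
```

Calling it from the control's DocumentCompleted handler, and assigning the result to pictureBox1.Image only when it is non-null, keeps it from running before the page exists.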

9 comments:

Jakub said...

awesome code mate, no one else seems to have known how to do it, thanks!

seyed vahid said...

I have a problem with your code.
Whenever I try to execute it, I get the following error:

"Object reference not set to an instance of an object."

in this line :
mshtml.IHTMLDocument2 doc = (mshtml.IHTMLDocument2) webBrowser1.Document.DomDocument;

Can you help me figure out what the problem is?

Patrice said...

Add Reference:
Microsoft HTML Object Library

Patrice said...

--------- All images.----------

private void GetObjects()
{
    mshtml.IHTMLDocument2 doc =
        (mshtml.IHTMLDocument2)brIE.Document.DomDocument;
    mshtml.IHTMLSelectionObject sobj = doc.selection;
    mshtml.HTMLBody body = doc.body as mshtml.HTMLBody;
    sobj.empty();

    // Loop to Count (not Count - 1) so the last image is not skipped.
    for (int im = 0; im < brIE.Document.Images.Count; im++)
    {
        Uri u = new Uri(brIE.Document.Images[im].GetAttribute("src"));
        string name = u.Segments[u.Segments.Length - 1];

        if (!IsGoodToSave(name)) continue;

        // Use a fresh range per image so only this image is selected and copied.
        mshtml.IHTMLControlRange range =
            body.createControlRange() as mshtml.IHTMLControlRange;
        range.add((mshtml.IHTMLControlElement)brIE.Document.Images[im].DomElement);
        range.select();
        range.execCommand("Copy", false, null);

        // The clipboard holds no image if the copy did not succeed.
        Image copied = Clipboard.GetImage();
        if (copied == null) continue;

        using (Bitmap bimg = new Bitmap(copied))
        {
            bimg.Save(Environment.TickCount.ToString() + "_" + name);
        }
        Clipboard.Clear();
    }
}

Patrice said...

private Boolean IsGoodToSave(string name)
{
    // Keep only file names that end in ".jpg" (case-insensitive).
    return name.EndsWith(".jpg", StringComparison.OrdinalIgnoreCase);
}

Jordan said...

I have been looking everywhere for this solution, thank you!!!

jiri said...

Thank you so much !
Excellent example.

Michael Stanford said...

I've been looking for this same code for more than a week, but for Visual Basic .NET. Could you tell me how I can make it work in VB.NET? Many thanks in advance.

huynhfxvn said...

thanks :D