Data mining from a website

  1. #1

    Does anyone know how to collect information from a webpage?

    Like for example, in my tool, if I select "Dragon" and press the "Get EXP info" button, I want the tool to print "700".
    (http://tibia.wikia.com/wiki/Dragon)

    How do I go about doing that? I'm hoping there are members that are experienced in this area.

  2. #2
    Regular Expressions

    C#:
    Code:
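    // Requires: using System; using System.Text.RegularExpressions;
    // Each pattern captures the text between a tag and the link to that stat's wiki page.
    Regex hitpoints = new Regex("</span>(?<hitpoints>[^<]*)<a href=\"/wiki/Hit_Points\"");
    Regex experience = new Regex("<br />(?<experience>[^<]*)<a href=\"/wiki/Experience\"");
    // Download the page source as a single string.
    System.Net.WebClient webclient = new System.Net.WebClient();
    string str = webclient.DownloadString("http://tibia.wikia.com/wiki/Dragon");
    // Print the captured experience value (it may include surrounding whitespace).
    Console.WriteLine(experience.Match(str).Groups["experience"].Value);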
    You may want to trim any trailing whitespace. You might also want to group the patterns together; I just separated them to make them easier to read.

    edit: the patterns can of course be improved; out of laziness I just tend to read up to the next opening/closing tag when it comes to HTML.
    For example, capturing only digits would make trimming whitespace superfluous.
    Last edited by Blaster_89; 07-01-2013 at 09:47 PM.
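
    A rough sketch of the two suggestions in this post (keeping the patterns together and trimming the captured text). The names StatScraper and Extract and the sample HTML string are made up for illustration; the real page markup may differ, so treat this as a starting point rather than a tested scraper.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class StatScraper
    {
        // The two patterns from this post, keyed by the name of their capture group.
        static readonly Dictionary<string, Regex> Patterns = new Dictionary<string, Regex>
        {
            { "hitpoints",  new Regex("</span>(?<hitpoints>[^<]*)<a href=\"/wiki/Hit_Points\"") },
            { "experience", new Regex("<br />(?<experience>[^<]*)<a href=\"/wiki/Experience\"") },
        };

        // Returns the trimmed capture, or null when the pattern does not match.
        public static string Extract(string html, string stat)
        {
            Match m = Patterns[stat].Match(html);
            return m.Success ? m.Groups[stat].Value.Trim() : null;
        }

        public static void Main()
        {
            // Hypothetical fragment standing in for the downloaded page source.
            string html = "<span>Stats</span> 1000 <a href=\"/wiki/Hit_Points\">Hit points</a>"
                        + "<br /> 700 <a href=\"/wiki/Experience\">Experience points</a>";
            Console.WriteLine(Extract(html, "experience")); // 700
            Console.WriteLine(Extract(html, "hitpoints"));  // 1000
        }
    }
    ```

    Checking m.Success matters because Groups[...].Value is just an empty string when nothing matched, which is easy to mistake for a real (empty) result.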

  3. #3
    Holy crap, it works.
    I totally thought it'd require a LOT more code than that little snippet.

    Thanks Blaster_89, now I can do what I really want to do.

  4. #4
    Here are the improved patterns:
    Code:
    Regex hitpoints = new Regex(">(?<hitpoints>[^\\D]*)(?:[^<]*)<a href=\"/wiki/Hit_Points\"");
    Regex experience = new Regex(">(?<experience>[^\\D]*)(?:[^<]*)<a href=\"/wiki/Experience\"");
    ?<name> gives the group a name, makes for better readability
    [^\\D]* means read until a non-digit character appears
    ?: means don't capture this group
    [^<]* means read until < appears

    edit: the double backslash is needed because the backslash is also the escape character in C# string literals; the actual regex pattern for a non-digit character is \D
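
    To see the digits-only pattern in action without downloading anything, here is a small sketch. It writes the pattern as a C# verbatim string, so the backslash is not doubled, and uses \d*, which matches the same characters as [^\D]*. The class name DigitPatternDemo, the method ExtractExperience, and the sample HTML are assumptions for illustration only.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class DigitPatternDemo
    {
        // In a verbatim string (@"..."), \d needs no doubling and "" stands for one quote.
        // \d* matches the same characters as [^\D]*.
        public static readonly Regex Experience =
            new Regex(@">(?<experience>\d*)(?:[^<]*)<a href=""/wiki/Experience""");

        // Hypothetical fragment; the real page markup may differ.
        public const string Sample =
            "<br />700 <a href=\"/wiki/Experience\">Experience points</a>";

        // Returns the captured digits, or null when the pattern does not match.
        public static string ExtractExperience(string html)
        {
            Match m = Experience.Match(html);
            return m.Success ? m.Groups["experience"].Value : null;
        }

        public static void Main()
        {
            // The digits-only group stops before the space, so no Trim() is needed.
            Console.WriteLine("[" + ExtractExperience(Sample) + "]"); // [700]
        }
    }
    ```

    One caveat: because \d* can match zero digits, whitespace between the > and the number would leave the group empty rather than fail the match.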

  5. #5
    Oh okay, thanks a bunch man!
    By the way, do you have a list of those other keyword "types", like the \\D you had?

  6. #6

  7. #7
    Thanks again! You've been a big help

  8. #8
    Not sure if TibiaML XML API is still running, but that would be my choice of data mining site. It was provided pretty much for this purpose, and your safest bet would be to download everything and stick it in your own XML or SQL database... Keeping up to date is always a pain though...

  9. #9
    Quote, originally posted by XtrmJash:
    Not sure if TibiaML XML API is still running, but that would be my choice of data mining site. It was provided pretty much for this purpose, and your safest bet would be to download everything and stick it in your own XML or SQL database... Keeping up to date is always a pain though...
    I would have done some kind of file storage, but "Keeping up to date is always a pain though..." is the reason why I want to do something like this.
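
    A middle ground between local storage and scraping on every click is to remember each scraped value and fetch it again only once it is older than some limit. The sketch below is in-memory only, and ScrapeCache, Get, and the fetch delegate are hypothetical names, not part of any library; saving the entries to an XML file or SQL database is left out.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical helper: remembers scraped values and re-fetches stale ones.
    public class ScrapeCache
    {
        private class Entry { public string Value; public DateTime FetchedAt; }

        private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
        private readonly TimeSpan maxAge;
        private readonly Func<string, string> fetch;

        public ScrapeCache(TimeSpan maxAge, Func<string, string> fetch)
        {
            this.maxAge = maxAge;
            this.fetch = fetch;
        }

        // 'now' is passed in so the freshness check is easy to try out.
        public string Get(string creature, DateTime now)
        {
            Entry e;
            if (entries.TryGetValue(creature, out e) && now - e.FetchedAt < maxAge)
                return e.Value; // still fresh, no download

            string value = fetch(creature); // e.g. download the page and apply the regex
            entries[creature] = new Entry { Value = value, FetchedAt = now };
            return value;
        }
    }

    public static class CacheDemo
    {
        public static void Main()
        {
            int downloads = 0;
            // Stand-in for the real download-and-regex step.
            var cache = new ScrapeCache(TimeSpan.FromDays(1), name => { downloads++; return "700"; });

            DateTime start = new DateTime(2013, 7, 1);
            cache.Get("Dragon", start);
            cache.Get("Dragon", start.AddHours(2)); // served from the cache
            cache.Get("Dragon", start.AddDays(2));  // stale, fetched again
            Console.WriteLine(downloads);           // 2
        }
    }
    ```

    In a real tool, DateTime.Now would be passed as the second argument and the fetch delegate would wrap the WebClient and Regex calls from earlier in the thread.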

  10. #10
    Quote, originally posted by Evan:
    I would have done some kind of file storage, but "Keeping up to date is always a pain though..." is the reason why I want to do something like this.
    Well, that was the best bit about the TibiaML API: it was updated by the TibiaML community, so it used to be just a case of acquiring a list of items from one section of it, then acquiring the details from another. The images were also hosted there, so you could just stick it on a site and leave it be. I think Rydan @ XenoBot forums has some method of getting item data, but I guess he made his own database using the dat / spr files from Tibia.
