How to scrape mobile numbers from JustDial?

Scraping mobile numbers from JustDial.com is a bit tricky because the numbers do not appear as digits in the page source. JustDial uses a custom CSS font to display them, so each digit is an empty span whose CSS class selects the glyph.

<span class="mobilesv icon-rq"></span> will output 5
<span class="mobilesv icon-wx"></span> will output 2, and so on.

So when scraping a category page, replace each of these spans with its digit using str_replace() and you will get the correct number.
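As a rough sketch of that replacement step: the function name decodeCategoryNumber() and the $digitByClass map are made up for illustration, and the map holds only the two class/digit pairs mentioned above. You would need to fill in the rest by inspecting the page yourself.

```php
<?php
// Hypothetical class-to-digit map. Only these two pairs come from the
// examples above; the remaining digits have to be looked up on the page.
$digitByClass = [
    'icon-rq' => '5',
    'icon-wx' => '2',
];

// Replace every obfuscated span with the character it renders.
function decodeCategoryNumber(string $html, array $digitByClass): string
{
    foreach ($digitByClass as $class => $digit) {
        $span = '<span class="mobilesv ' . $class . '"></span>';
        $html = str_replace($span, $digit, $html);
    }
    return $html;
}

$sample = '<span class="mobilesv icon-rq"></span><span class="mobilesv icon-wx"></span>';
echo decodeCategoryNumber($sample, $digitByClass); // prints 52
```

This assumes the span markup is exactly as shown; if the site adds attributes or changes the quoting, the search string has to change too.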

This is easy for a category page, but a single listing is a little more complicated because the CSS classes for the digits are randomized. Example: https://www.justdial.com/Mumbai/Fortis-Hospital-Opposite-Dmart-Mulund-West/022PXX22-XX22-120327123542-Z3U6_BZDET

On this page, each time you refresh, you see different source code for the phone number.

Here’s the code that might be useful for scraping:


// Assumes $html is a PHP Simple HTML DOM object for the listing page,
// e.g. $html = file_get_html($url); after including simple_html_dom.php.

// Collect the contents of every <style> block on the page.
$styles = array();
foreach ($html->find('style') as $style) {
    $styles[] = $style->innertext;
}
// The icon rules are in the second <style> block.
$style = $styles[1];

// Keep only the rules from the first ".icon-" up to ".mobilesv".
$style = substr($style, 0, strpos($style, ".mobilesv"));
$style = strstr($style, ".icon-");

// One rule per element, e.g. .icon-fe:before{content:"\9d001"
$rules = explode("}", $style);

// Characters in rule order: \9d001 is "0", ..., \9d010 is "9",
// then "+", "-", ")" and "(". This relies on the rules being in sequence.
$chars = array("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "+", "-", ")", "(");

// Reduce each rule to its bare class name, e.g. "icon-fe".
$classes = array();
for ($i = 0; $i < 14; $i++) {
    $code = sprintf('\9d%03d', $i + 1);
    $rule = str_replace(".icon", "icon", $rules[$i]);
    $classes[$i] = str_replace(':before{content:"' . $code . '"', '', $rule);
}

// Leftover markup to strip once the class names have been replaced.
$markup = array("</span>", "\"", "<span class=mobilesv ", "<a class=tel mtel", "<a class=tel ttel", "</a>", ">");

$n = 0;
$contactnumbers = "";
foreach ($html->find('#comp-contact .telnowpr .tel') as $contactinfo) {
    $n++;
    // Work on the element's HTML as a plain string.
    $contactinfo = (string) $contactinfo;

    echo "<p>$n) Phone Number: $contactinfo</p>";

    // Swap each class name for its character, then remove the tags.
    $contactinfo = str_replace($classes, $chars, $contactinfo);
    $contactinfo = str_replace($markup, "", $contactinfo);
    echo "<p>&nbsp; &nbsp; Phone Number: <span class=\"phone\"><b>$contactinfo</b></span></p>";

    // Wrap the number as ="..." and build a comma-separated list.
    $contactinfo = "=\"$contactinfo\"";
    if ($n == 1) {
        $contactnumbers = $contactinfo;
    } else {
        $contactnumbers = $contactnumbers . "," . $contactinfo;
    }
}

5 Replies to “How to scrape mobile numbers from JustDial?”

      1. Thanks for the prompt reply.
        Does the above code work for a single listing as well?
        As you said, they use random CSS classes for the digits.

        1. I last used the code in August. If they haven’t made any changes to the site, it should work.

          FYI: They block your IP if you send too many requests. I used XAMPP to run the PHP file on my computer, and whenever I got blocked, I restarted my modem to get a different IP.

        2. Just checked my scraper, and it is not working properly. It seems they have modified their CSS a bit, and the CSS rules are now jumbled. The sequence 1, 2, 3 is no longer followed, and my code only works when the rules are in sequence.

          When this is the case:
          .icon-fe:before{content:"\9d001"
          .icon-ji:before{content:"\9d002"
          .icon-acb:before{content:"\9d003"
          .icon-yz:before{content:"\9d004"
          .icon-wx:before{content:"\9d005"
          .icon-nlm:before{content:"\9d006"
          .icon-dc:before{content:"\9d007"
          .icon-ba:before{content:"\9d008"
          .icon-ts:before{content:"\9d009"
          .icon-trs:before{content:"\9d010"
          .icon-wyx:before{content:"\9d011"
          .icon-nm:before{content:"\9d012"
          .icon-ikj:before{content:"\9d013"
          .icon-hg:before{content:"\9d014"

          it works. Otherwise it doesn’t. For example:

          .icon-rq:before{content:"\9d002"
          .icon-ikj:before{content:"\9d004"
          .icon-po:before{content:"\9d001"
          .icon-ba:before{content:"\9d005"
          .icon-ji:before{content:"\9d010"
          .icon-dc:before{content:"\9d007"
          .icon-yz:before{content:"\9d009"
          .icon-ts:before{content:"\9d003"
          .icon-trs:before{content:"\9d006"
          .icon-nm:before{content:"\9d008"
          .icon-nlm:before{content:"\9d011"
          .icon-fde:before{content:"\9d012"
          .icon-acb:before{content:"\9d013"
          .icon-hg:before{content:"\9d014"

          You can tweak the code to make it work. It’s still easy to scrape the numbers.
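One possible tweak, as an untested-against-the-live-site sketch: instead of relying on rule position, read the number at the end of each content code and use that to pick the character. The function names buildClassMap() and decodeNumber() are made up here, and the code-to-character order (\9d001 is 0 through \9d014 is "(") is assumed to match the original script.

```php
<?php
// Build a class => character map from the icon rules, whatever their order.
function buildClassMap(string $css): array
{
    // Index 0 belongs to \9d001, index 13 to \9d014 (assumed order).
    $chars = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '-', ')', '('];
    $map = [];
    // Capture the class name and the three-digit suffix of each rule.
    $pattern = '/\.(icon-[a-z]+):before\{content:"\\\\9d(\d{3})"/';
    preg_match_all($pattern, $css, $rules, PREG_SET_ORDER);
    foreach ($rules as $rule) {
        $index = (int) $rule[2] - 1;
        if (isset($chars[$index])) {
            $map[$rule[1]] = $chars[$index];
        }
    }
    return $map;
}

// Read the icon classes in document order and translate each one.
function decodeNumber(string $markup, array $map): string
{
    preg_match_all('/mobilesv (icon-[a-z]+)/', $markup, $found);
    $number = '';
    foreach ($found[1] as $class) {
        $number .= $map[$class] ?? '?'; // '?' marks a class with no rule
    }
    return $number;
}

// First three rules of the jumbled example above.
$css = '.icon-rq:before{content:"\9d002"}.icon-ikj:before{content:"\9d004"}.icon-po:before{content:"\9d001"}';
$markup = '<span class="mobilesv icon-po"></span><span class="mobilesv icon-rq"></span><span class="mobilesv icon-ikj"></span>';
echo decodeNumber($markup, buildClassMap($css)); // prints 013
```

Because the map is keyed by class name, it does not matter whether the rules arrive in sequence or jumbled. It will still break if the site changes the content codes or the rule syntax.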
