How to write a simple scraper in PHP without Regex

Util, howto, parsing 5 Comments »

Web scrapers are simple programs that extract specific data from web pages. The structure of the pages is usually known in advance, so scrapers are less complex than general-purpose parsers and crawlers.

In this tutorial we are going to create a simple scraper that extracts the title and favicon from any HTML page.

Scrapers are usually based on regular expressions, but we are going to avoid them because they are difficult to maintain and sometimes produce unexpected results. We are going to use simple PHP string functions instead.
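
As a rough illustration of the string-function approach, here is a minimal sketch. The helper names (extractBetween, extractTitle, extractFavicon) and the sample HTML are assumptions made for this example, not the code from the full post. It also assumes a plain <title> tag without attributes and a favicon declared with rel="icon" or rel="shortcut icon".

```php
<?php
// Hypothetical helper: returns the text between the first $start marker
// and the next $end marker (case-insensitive), or null if either is missing.
function extractBetween(string $html, string $start, string $end): ?string
{
    $from = stripos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = stripos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return trim(substr($html, $from, $to - $from));
}

// Hypothetical helper: the page title is whatever sits between <title> and </title>.
function extractTitle(string $html): ?string
{
    return extractBetween($html, '<title>', '</title>');
}

// Hypothetical helper: walks over every <link ...> tag and returns the href
// of the first one whose rel value ends in "icon".
function extractFavicon(string $html): ?string
{
    $offset = 0;
    while (($pos = stripos($html, '<link', $offset)) !== false) {
        $close = strpos($html, '>', $pos);
        if ($close === false) {
            break;
        }
        $tag = substr($html, $pos, $close - $pos + 1);
        if (stripos($tag, 'icon"') !== false) {
            return extractBetween($tag, 'href="', '"');
        }
        $offset = $close + 1;
    }
    return null;
}

$sampleHtml = '<html><head><title> Example Page </title>'
    . '<link rel="stylesheet" href="/style.css">'
    . '<link rel="shortcut icon" href="/favicon.ico"></head><body></body></html>';

echo extractTitle($sampleHtml), "\n";
echo extractFavicon($sampleHtml), "\n";
```

For a live page, the HTML string could come from file_get_contents($url), provided the hosting configuration allows remote URLs.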

Creating a Simple PHP Cache Script

Util, api, howto 8 Comments »

Caching is a programming concept that can be used in a wide range of applications and for various purposes. A cache library can store database query results for later use, store rendered pages so they can be served again without being regenerated, or save indexed pages in a crawler application so that multiple modules can process them.

A cache mechanism is simpler than it might sound. It is just a small module that implements two actions, plus an optional third:
  • store a value (identified by a key);
  • retrieve a value if it has not expired;
  • optionally, invalidate a set of values or the entire cache.

In this tutorial we are going to create a disk cache script. It stores string values in files: each value is stored in its own file, accompanied by an additional file that holds the expiration date. Performance-wise this is not the best approach, but the script is designed this way for a clear purpose: the additional file can store other attributes besides the expiration date. Imagine an application that crawls pages with different modules. Each time a module crawls a page, it adds its result to the additional file.
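
The design described above (one file for the value, one extra file for the expiration date) could look roughly like the sketch below. The class name SimpleDiskCache, its method names, the ".meta" suffix, and the JSON format of the extra file are all assumptions made for illustration; the script in the full post may differ.

```php
<?php
// Minimal disk cache sketch. Every name here is illustrative.
class SimpleDiskCache
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = rtrim($dir, '/');
        if (!is_dir($this->dir)) {
            mkdir($this->dir, 0777, true);
        }
    }

    // Maps a key to a file path; md5 keeps the file name filesystem-safe.
    private function path(string $key): string
    {
        return $this->dir . '/' . md5($key);
    }

    // Stores the value in one file and its attributes in a second ".meta" file.
    public function set(string $key, string $value, int $ttlSeconds): void
    {
        $path = $this->path($key);
        file_put_contents($path, $value);
        file_put_contents($path . '.meta', json_encode(['expires' => time() + $ttlSeconds]));
    }

    // Returns the value, or null when it is missing or expired.
    public function get(string $key): ?string
    {
        $path = $this->path($key);
        if (!is_file($path) || !is_file($path . '.meta')) {
            return null;
        }
        $meta = json_decode(file_get_contents($path . '.meta'), true);
        if (!is_array($meta) || ($meta['expires'] ?? 0) < time()) {
            $this->delete($key);
            return null;
        }
        return file_get_contents($path);
    }

    // Invalidates a single entry by removing both of its files.
    public function delete(string $key): void
    {
        $path = $this->path($key);
        foreach ([$path, $path . '.meta'] as $file) {
            if (is_file($file)) {
                unlink($file);
            }
        }
    }
}

$cache = new SimpleDiskCache(sys_get_temp_dir() . '/simple_cache_demo');
$cache->set('greeting', 'hello', 60);   // valid for 60 seconds
echo $cache->get('greeting'), "\n";
```

Because the attributes live in their own file, extra fields (for example, per-module crawl results) could be added to the same array without touching the stored value.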

PHP Class to Retrieve Alexa Rank

api, parsing 4 Comments »

Alexa is a service acquired by Amazon that offers web traffic reports for websites. It collects data from a toolbar that can be installed in different browsers, centralizes the data, and displays reports to anyone. The most important indicator is the Alexa Rank, which represents the position of a website in a list of all websites. It is not 100% accurate, but it gives a good indication.

Alexa does not offer a free API to obtain the Alexa Rank. However, there is a simple way to obtain it, the same way the Alexa Toolbar does. All you have to do is request the following URL (replacing php-html.net with your domain): http://data.alexa.com/data?cli=10&dat=s&url=php-html.net
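
A small sketch of how that request URL could be assembled for an arbitrary domain is shown below. The function name buildAlexaDataUrl is hypothetical, and the sketch only builds the URL quoted above. It does not send the request and makes no assumption about the format of the response or whether the endpoint still answers.

```php
<?php
// Hypothetical helper: builds the data URL quoted in the text for a given domain.
function buildAlexaDataUrl(string $domain): string
{
    // http_build_query URL-encodes the values; '&' is passed explicitly
    // so the result does not depend on the arg_separator.output ini setting.
    $query = http_build_query(['cli' => 10, 'dat' => 's', 'url' => $domain], '', '&');
    return 'http://data.alexa.com/data?' . $query;
}

echo buildAlexaDataUrl('php-html.net'), "\n";
```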

Model View Controller in PHP

Patterns 18 Comments »

The model view controller pattern is the most widely used pattern in today’s web applications. It was first used in Smalltalk and later adopted and popularized by Java. At present there are more than a dozen PHP web frameworks based on the MVC pattern.

Although the MVC pattern is very popular in PHP, it is hard to find a proper tutorial accompanied by a simple source code example. That is the purpose of this tutorial.

How to handle URLs in PHP

howto, parsing 7 Comments »

URL handling is one of the tasks you have to do from time to time in PHP. Sometimes you need it to record referring sites, other times to write your own spider, or simply to retrieve the current URL.

PHP is a language developed around the web, for web developers, and it contains all the functions you might need. The PHP documentation has a section that groups the URL functions. Along with a few rarely used encoding and decoding functions, the package contains the functions you cannot live without:
Read the rest of this entry »
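
Since the excerpt stops before naming the functions, the sketch below simply picks two standard ones, parse_url and parse_str, as an assumption about what such a list would include. The sample URL and the currentUrl helper are hypothetical. The helper takes the server array as a parameter so it can be tried outside a web request; in a real page you would pass $_SERVER.

```php
<?php
// Split a URL into its components with parse_url.
$url = 'http://www.example.com:8080/path/page.php?id=42&tag=php#top';
$parts = parse_url($url);

// Turn the query string into an associative array with parse_str.
parse_str($parts['query'], $params);

echo $parts['host'], "\n";   // host component of the sample URL
echo $params['tag'], "\n";   // value of the "tag" query parameter

// Hypothetical helper: rebuilds the current URL from the usual server variables.
function currentUrl(array $server): string
{
    $isHttps = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
    return ($isHttps ? 'https' : 'http') . '://' . $server['HTTP_HOST'] . $server['REQUEST_URI'];
}

// Simulated request data, standing in for $_SERVER.
$fakeServer = ['HTTP_HOST' => 'www.example.com', 'REQUEST_URI' => '/index.php?page=2'];
echo currentUrl($fakeServer), "\n";
```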

How to Write a PHP Script to Run Shell Commands from Browser

howto 3 Comments »

I often have to run shell commands in a hosting environment, and I do it via a simple PHP script. I tested it on GoDaddy, DreamHost, and other hosting environments, and it works fine.

Before starting the tutorial, note that this script can have undesired results if it is not handled carefully. A wrong rm command can delete all the files in your hosting account, so run commands with care.
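
The core of such a script can be sketched with PHP's exec function, as below. The helper name runShellCommand is hypothetical and this is not the script from the full post. It assumes a Unix-like shell and a host that has not disabled exec. The example runs a fixed, harmless command; a browser-facing version would read the command from a form field and should sit behind authentication for the reasons given above.

```php
<?php
// Hypothetical helper: runs a command and returns its output and exit code.
function runShellCommand(string $command): array
{
    $lines = [];
    $exitCode = 0;
    // "2>&1" folds error messages into the captured output (Unix-like shells).
    exec($command . ' 2>&1', $lines, $exitCode);
    return ['output' => implode("\n", $lines), 'exitCode' => $exitCode];
}

$result = runShellCommand('echo hello');
echo $result['output'], "\n";
echo 'exit code: ', $result['exitCode'], "\n";
```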

How to Identify Duplicate and Similar Text in PHP

howto 1 Comment »

It’s not a common problem, but sometimes you have to check whether two texts are similar. If you have to aggregate data from multiple sources, you might know what I’m talking about.

The simplest thing you can try is to compare the two strings directly. A simple comparison will not help if one of the strings contains an extra space, so a more robust algorithm is needed for such cases. Fortunately, PHP provides several functions that can be used.
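
The excerpt does not say which functions the full post uses, so the sketch below assumes two standard candidates, similar_text and levenshtein, next to a whitespace-normalizing comparison. The helper name normalizeWhitespace and the sample strings are made up for the example.

```php
<?php
// Hypothetical helper: collapses runs of whitespace so that an extra space
// does not defeat a plain comparison.
function normalizeWhitespace(string $text): string
{
    return trim(preg_replace('/\s+/', ' ', $text));
}

$a = 'Hello  world';   // note the double space
$b = 'Hello world';

var_dump($a === $b);                                           // plain comparison
var_dump(normalizeWhitespace($a) === normalizeWhitespace($b)); // after normalizing

// similar_text returns the number of matching characters and can also
// report the similarity as a percentage through its third argument.
$matching = similar_text('World', 'Word', $percent);

// levenshtein returns the number of single-character edits
// (insert, delete, replace) needed to turn one string into the other.
$distance = levenshtein('kitten', 'sitting');

echo $matching, ' ', round($percent, 2), ' ', $distance, "\n";
```

Which threshold counts as "similar" depends on the data; a percentage cut-off or a maximum edit distance would have to be tuned per application.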

How to Send Mail From PHP

Mail No Comments »

Sending mail from PHP can raise certain problems. Usually mail is sent through a simple PHP function: mail(…). However, the function depends on mail settings that must be configured in the php.ini file. Not all hosting providers enable it, and on shared hosting you cannot change the PHP configuration yourself. In that case an alternative should be used. This tutorial describes two PHP libraries I have tried for sending mail: PHPMailer and the PEAR Mail package.
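
For reference, a call to the built-in mail() function might look like the sketch below. The helper names buildMailHeaders and sendPlainMail are hypothetical, and the header set is just one reasonable choice. The wrapper is only defined here, not called, because whether mail() actually delivers anything depends on the server configuration discussed above.

```php
<?php
// Hypothetical helper: builds the additional headers string for mail().
function buildMailHeaders(string $from): string
{
    return implode("\r\n", [
        'From: ' . $from,
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
    ]);
}

// Hypothetical wrapper: returns true when PHP accepted the message for delivery,
// false otherwise (for example, when no mail transport is configured).
function sendPlainMail(string $to, string $subject, string $body, string $from): bool
{
    return mail($to, $subject, $body, buildMailHeaders($from));
}

// Example call (placeholder addresses), left commented out on purpose:
// sendPlainMail('user@example.com', 'Test', 'Hello from PHP', 'noreply@example.com');
echo buildMailHeaders('noreply@example.com'), "\n";
```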

Svn Commands

Uncategorized No Comments »

svn co
svn add --force .
svn commit -m "this is a message"

Show changed files:
svn log -r 1:2 -v

Ignore Command:

Ignore a file:
svn propset svn:ignore .project .

Ignore a directory and all the files inside it:
svn propset svn:ignore '*' classes/

If the location of the repository has changed, you can use the following command to switch the local working copy (checked out from the old location of the repository) to the new one:
svn switch --relocate svn://oldlocation.com/svn/project/trunk http://newlocation.org/repository/project/trunk

svn switch --relocate svn://anonsvn.opensource.apple.com/svn/webkit/trunk \
http://svn.webkit.org/repository/webkit/trunk

Dates in PHP and MySQL using DATETIME fields

Date & Time 1 Comment »

This post covers the basic date and time operations that most of us use in PHP and MySQL. Manipulating date and datetime fields in PHP is difficult because PHP is not very rich in date and time functions. This is why in most situations we have to write additional functions to subtract dates, calculate time intervals, calculate the difference between datetime elements, and so on.

Unlike PHP, MySQL is very rich in functions and support for datetime elements. It’s good that we can always find what we need, but this comes at a price: having so many options makes the decision difficult.

In PHP there are 4 functions which can be used to handle most of the datetime operations:

  • time() - returns the current time as the number of seconds since January 1 1970 00:00:00 GMT.
  • mktime($hour,$minute,$second,$month,$day,$year) - returns the time specified by its arguments, as the number of seconds since January 1 1970 00:00:00 GMT.
  • date($format,[$timestamp]) - converts a timestamp produced by time or mktime into a formatted string.
  • strtotime - converts a string into a timestamp.
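
The four functions above can be combined as in the following sketch. The dates are arbitrary sample values, and the time zone is fixed to UTC only so the results are reproducible. The 'Y-m-d H:i:s' format string is the layout MySQL uses for DATETIME values.

```php
<?php
// Fix the time zone so mktime/date/strtotime give reproducible results.
date_default_timezone_set('UTC');

// Current time as a Unix timestamp.
$now = time();

// Timestamp for 15 June 2020, 12:30:00 (hour, minute, second, month, day, year).
$ts = mktime(12, 30, 0, 6, 15, 2020);

// Timestamp -> string in MySQL DATETIME layout.
$formatted = date('Y-m-d H:i:s', $ts);

// String -> timestamp.
$parsed = strtotime('2020-06-15 12:30:00');

// Relative expressions: one day after $ts.
$nextDay = date('Y-m-d H:i:s', strtotime('+1 day', $ts));

// Difference between two datetimes, in whole days.
$diffDays = intdiv(strtotime('2020-06-20 12:30:00') - $ts, 86400);

echo $formatted, "\n", $nextDay, "\n", $diffDays, "\n";
```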

Let’s start this tutorial by creating a MySQL table containing a DATETIME field:
Read the rest of this entry »
