CGI stands for "Common Gateway Interface". It is simply an interfacing protocol that is already running on most hosting servers for you, and a term you don't really need to know much about.
In short, CGI defines how a web server hands a browser's request (the information from an HTML form on a web page, for example) to a program, and how the program's answer gets back to the browser. That's simplifying it, but you get the point. In the broader sense, however, the term 'CGI' is often used to mean "any program that runs on a web server and interacts with a web browser".
You may hear someone ask, "Where can I get a CGI Script to handle these Forms?" or say, "Use CGI to do what you need". Strictly speaking, that is a misuse of the term: CGI is the interface that is already running on your web server, not the script itself. What they really should say is, "Get a Perl Script to handle your interactivity scripting needs."
Okay, so a web server spends most of its time answering requests, loading the HTML page that a user is requesting, and sending it to them. Nothing too complicated. But this isn't very exciting, is it?
What if we want the user to see something different every time they load a page? What if we want to ask a user for information, and save it to a Flatfile Database? What if we want to display Database Information from a file that may change 5 times a day?
In situations like this, loading a static (non-changing) page just isn't good enough. We need the web server to run a program, take some action, and then send a results page back to the user's web browser. The results page might be different every time the program is run.
Let's take an example . . .
You create a web page that has a form in it, asking for the user's name and email address. There is also a 'Submit' button. When the user presses Submit, their information should be sent to you in an email and/or saved in a file on the web server that you can view later, and they should get a 'Thank You' screen back.
This is a basic form. An HTML Tutorial is good if you need to learn more about Forms in HTML. If you don't understand forms, read up on them first. You can't really tackle CGI and Perl scripts without understanding forms.
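In your original HTML document, you will have a <FORM> tag and some <INPUT> tags. For example (the ACTION URL shown here is only a placeholder; use the location of your own script):
<FORM METHOD="POST" ACTION="/cgi-bin/program.cgi">
Name: <INPUT NAME="realname">
Email: <INPUT NAME="email">
<INPUT TYPE="SUBMIT" VALUE="Submit Form">
<INPUT TYPE="RESET" VALUE="Reset Form">
</FORM>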
Note: I recommend that you create a separate directory on the server under the cgi-bin for every different Perl Script Package you install.
This is because, over time, two installed scripts might both use a data storage file called "data.txt" or "data.dat". That could cause real problems between their operations if both were in the same /cgi-bin/ directory!
Some scripts HAVE to be directly in the /cgi-bin/, but this is very rarely an absolute requirement.
(Nothing beats Pre-Planning and Good Directory Management.)
The <FORM> tag has two parameters that are important for us.
The METHOD parameter defines how the browser will send the information to the server, and how the web server will pass it to your program. It can be either "POST" or "GET"; you will most often see "POST". In short, GET appends the form data to the URL, while POST sends it in the body of the request. For a full explanation of the difference, you need a longer tutorial or a book.
The other parameter, ACTION, is the URL of the program on the server that will process the information sent from the form and do something with it.
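To make the difference between the two methods a little more concrete, here is a minimal sketch of how a Perl script might pick up the form data by hand. Under CGI, GET data arrives in the QUERY_STRING environment variable and POST data arrives on standard input, with its length in CONTENT_LENGTH. The sub names (url_decode, parse_pairs, read_form_data) are made up for this illustration, and the sample values are invented; a real script would more likely rely on a tested library.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded value: '+' means a space, %XX is a hex-encoded byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split "name=value&name=value" into a hash of decoded form fields.
sub parse_pairs {
    my ($raw) = @_;
    my %form;
    for my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $form{ url_decode($name) } = url_decode( defined $value ? $value : '' );
    }
    return %form;
}

# GET data arrives in the QUERY_STRING environment variable;
# POST data arrives on standard input, CONTENT_LENGTH bytes long.
sub read_form_data {
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    my $raw    = '';
    if ($method eq 'POST') {
        read(STDIN, $raw, $ENV{CONTENT_LENGTH} || 0);
    } else {
        $raw = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
    }
    return parse_pairs($raw);
}

# Demo with a hard-coded string, so this also runs from the command line.
my %form = parse_pairs('realname=Jane+Doe&email=jane%40example.com');
print "$_ = $form{$_}\n" for sort keys %form;
```

Inside a real script running on the server, you would call read_form_data() instead of passing a hard-coded string.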
Here is the process in a nutshell . . .
- Sent From the Form on the Web Page
When the user hits the form 'Submit' button, the web browser makes a connection to the server, requests the URL in the 'ACTION' parameter, and sends along all the Form Value Pairs (each field name with the value the user entered into the Form). This is the CGI "in".
- Processing on the Server
The web server looks at the URL, realizes it is an executable Perl Program rather than a static HTML or text file, and runs it using the Perl Interpreter on the server.
- Results Returned Back to Browser
The program then grabs all the data sent to it, does something with it (searches a database, for example), and returns HTML results via the CGI "out". The server passes these back to the browser as the response to the user.
That's it! That's the basic process that almost all CGI "Perl" scripts are going to perform.
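As a rough sketch of the "results returned" half of that process, the following Perl builds the 'Thank You' screen for the name and email form. The sub names (escape_html, thank_you_page) and the sample values are invented for illustration; the field names realname and email come from the form example. The form values are escaped before they go into the page, so anything the user typed is shown as plain text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape characters that have special meaning in HTML, so user input
# is displayed as text rather than interpreted as markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build the complete CGI response: header, blank line, then the HTML page.
sub thank_you_page {
    my (%form) = @_;
    my $name  = escape_html( defined $form{realname} ? $form{realname} : '' );
    my $email = escape_html( defined $form{email}    ? $form{email}    : '' );
    return "Content-type: text/html\n\n"
         . "<HTML><BODY>\n"
         . "<H1>Thank You, $name!</H1>\n"
         . "<P>We will write to you at $email.</P>\n"
         . "</BODY></HTML>\n";
}

# Hard-coded sample values stand in for real form data here.
print thank_you_page( realname => 'Jane Doe', email => 'jane@example.com' );
```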
CONFIGURATION ON THE SERVER
In order for all this to happen, you need to make sure your web server is set up to handle this whole thing. Contact your web hosting company to determine if this is a provided feature on your server account.
When you (your browser, actually) request a URL from a server, the server needs to do some checking to find out what to do. How does the server know if the URL you are requesting is a static file it should just load and send, or a program it should run, sending you the output? This is typically decided by three factors: which directory the file is in, its file extension, and whether it has executable permissions.
First, let's look at the directory part. If you're reading this, you've no doubt heard of a 'cgi-bin' directory, and noticed that most CGI scripts need to be in this directory. Why? Well, this is a server configuration issue. The server is set up to know that any file in this directory is a program to run, and not a static file to send to the browser. Usually, you can't even put a regular HTML file in this directory, because when the server tries to load it, it will try to run it as a program rather than just send it as a file.
Are you curious where the name 'cgi-bin' came from? Well, it goes back to the original days of the NCSA web server. By default, this web server had two directories: cgi-src and cgi-bin. The first contained source code for CGI programs that could run on the server. The second contained the binaries (compiled executables) of the programs, which could be run on the server. Web servers typically don't have the cgi-src directory anymore, but the name cgi-bin has stuck around as the 'default' place to put executable CGI programs on a web server.
Now let's look at the second factor to determine whether a web server runs the file or loads it as a static file: the file extension.
The extension of a file on the server - .html, .cgi, .pl, .txt, etc... - tells the server what kind of file it is and how to handle it. It knows that .html and .txt files are plain text static files that should just be sent to the browser, for example. You can add your own file extensions through the web server's configuration options, and tell it how to handle those files. The .cgi extension is one example of an extension that the web server is configured to recognize as a program it should run.
Okay, now let's go back to the <FORM> tag from our example and its ACTION parameter.
When the web browser sends its request to the ACTION URL, the web server sees that the file is in the cgi-bin directory and that its extension is .cgi, .pl, etc., so it knows that this is a program that it should run. It hands off a request to the operating system and the Perl Interpreter telling them to run the program, and also passes all the form data to your Perl script program to be "processed".
Makes perfect sense, doesn't it?
RUNNING THE PROGRAM
We're now at the point where the web server has decided it should run the CGI program, and it has made the request to the Operating System to execute the file. This is where a lot of problems start happening, because there are a lot of things that need to be exactly correct in order for the program to run successfully and send the output back to the web browser. Some of these potential problems are specific to UNIX, and some are specific to Windows NT (I won't go into other operating systems because these two are the most common). I'll just go down the list of things that need to be correct in order for this to work.
1. The file needs to be executable
In Unix, files have attributes that don't exist in the Windows NT world. One of these is the executable bit. Each file has a setting that tells the operating system whether it can be executed as a program or not, and whether it can be run by only the file owner, only the group that the file owner is in, or by everyone on the server. In order for the operating system to run the file, it needs to be marked as 'executable' by Everyone. This is what the 'chmod' command does. I won't go into detail about how chmod works, but when you see an instruction that says something like 'do a chmod 755 on the program.cgi file', what it is telling you is to make the file readable and executable by everyone on your server (and writable only by its owner), so it can be run from the web server. For more information on permissions, check Setting File Permissions.
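If the number 755 looks like magic, this small Perl sketch may help. It spells out what a numeric mode means and then applies it with Perl's built-in chmod function, which does the same job as the chmod command. The sub name describe_mode is invented for this example, and program.cgi is just the placeholder file name from the text above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a numeric mode such as 0755 into the familiar "rwxr-xr-x" string.
# Each octal digit covers one class: owner, group, everyone else.
sub describe_mode {
    my ($mode) = @_;
    my $text = '';
    for my $shift (6, 3, 0) {
        my $bits = ($mode >> $shift) & 7;
        $text .= ($bits & 4 ? 'r' : '-')
               . ($bits & 2 ? 'w' : '-')
               . ($bits & 1 ? 'x' : '-');
    }
    return $text;
}

print describe_mode(0755), "\n";    # owner: rwx, group: r-x, others: r-x

# 'program.cgi' is a placeholder name. The leading 0 makes 0755 octal.
my $script = 'program.cgi';
if (-e $script) {
    chmod 0755, $script or warn "chmod failed for $script: $!\n";
    print "$script is ", (-x $script ? '' : 'NOT '), "executable\n";
}
```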
2. The file needs to point to a valid executable
For .cgi files, the server knows to run it as a program, but it needs to know HOW to run it. If it's a compiled executable, there's no problem - it just runs it. But if it's a run-time script written in a programming language like Perl, the server needs to know where to find the Perl Interpreter that will run the script. This is the function of the first line of the file:
#!/usr/local/bin/perl
In Unix, this points to an executable file (in this case, the program is named 'perl') that will run your script. The first two characters - #! - are called a shebang, and it's common Unix syntax. If your script starts with the line above, and your server doesn't have a program called /usr/local/bin/perl, the whole thing will die and you'll get an error back. For perl scripts, the line above is typical, and most servers have /usr/local/bin/perl.
But in some rare instances, things are configured differently and you need to edit this first line to point to a valid program to run. This would be a good second try at the shebang line:
#!/usr/bin/perl
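If you are not sure which of the two paths your server uses, a short Perl check like the following can tell you. This is only a sketch: the sub name shebang_interpreter is invented for the example, and it assumes you can run a script on the server (or ask your host to).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the interpreter path named on a shebang line, or undef if the
# line is not a shebang line at all.
sub shebang_interpreter {
    my ($first_line) = @_;
    return $first_line =~ /^#!\s*(\S+)/ ? $1 : undef;
}

# Report which of the two usual locations actually exist on this machine.
for my $path ('/usr/local/bin/perl', '/usr/bin/perl') {
    print "$path: ", (-x $path ? 'found' : 'not found'), "\n";
}

# Extract the interpreter from a sample first line.
print shebang_interpreter("#!/usr/local/bin/perl -w\n"), "\n";
```

From a command prompt on the server, 'which perl' gives the same answer more quickly, if you have that kind of access.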
3. The program needs to return a valid response
Any CGI program that runs needs to return a valid response to the browser. If it encounters a problem while running and dies, however, it may output an error message instead. If you were running the program in a normal window on NT or Unix, you would simply see the error message. In the web world, however, the program needs to hand its response back to the web server, which then packages it up to send back to the browser. If the program outputs an error message, the web server does not get the response it expects and instead returns a general error (500 - Internal Server Error, for example) to the browser, saying there was a problem running the program.
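One common way to keep a script from ever handing the server an invalid response is to run the real work inside Perl's eval block and turn any failure into a proper error page. The sketch below shows the idea; the sub name run_safely and the sample error message are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run the page-building code inside eval. If it dies, return a valid
# error page instead of letting the raw error message become the output.
sub run_safely {
    my ($build_page) = @_;
    my $body = eval { $build_page->() };
    if (!defined $body) {
        my $error = $@ || 'unknown error';
        $error =~ s/&/&amp;/g;
        $error =~ s/</&lt;/g;
        $body = "<HTML><BODY><H1>Sorry</H1><P>The script failed: $error</P></BODY></HTML>\n";
    }
    return "Content-type: text/html\n\n" . $body;
}

# The second call simulates a failure, e.g. a data file that cannot be opened.
print run_safely( sub { return "<HTML><BODY>All good</BODY></HTML>\n" } );
print run_safely( sub { die "cannot open data.txt\n" } );
```

Either way, the browser gets a page that starts with a valid Content-type line, so you see a readable message rather than a bare 500 error.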
SERVER SIDE PROCESS
Now that you understand how things need to be set up, it's a good time to step through the whole process and see exactly what happens when a CGI script is run. Going back to our original example, here is the sequence of events (assuming a Unix server):
1. The browser requests the URL in the ACTION parameter, and passes all the data along with the request.
2. The server recognizes that .cgi means that this file should be run.
3. It checks to make sure that CGI programs are allowed to run on the server.
4. It checks to make sure that CGI programs are allowed to run in the /cgi-bin directory.
5. It launches a sub-process to run the program in the operating system.
6. The operating system opens the file and looks at the first line to see which program to use with the script.
7. It runs this program and passes it the filename to run.
8. The script runs, does whatever it needs to, and then returns an HTML response, using print() statements, for example.
9. The whole response is passed back to the server, which then packages it up in an HTTP response, including content length, etc.
10. The server then passes the whole response back to the browser, which displays it to the viewer that requested the action.
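The "packaging up" that the server does near the end of that sequence can be pictured with a toy Perl sub. This is only a simplified model of the idea, not how any particular web server is written: it splits the script's output at the first blank line, adds a status line and a Content-Length, and joins everything into one HTTP response. The sub name package_response is invented, and the length is counted in characters, which matches bytes only for plain ASCII output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split the script's output at the first blank line into header and body,
# then wrap it in a complete HTTP response with a status line and length.
sub package_response {
    my ($cgi_output) = @_;
    my ($headers, $body) = split /\r?\n\r?\n/, $cgi_output, 2;
    $headers = '' unless defined $headers;
    $body    = '' unless defined $body;
    return "HTTP/1.0 200 OK\r\n"
         . join("\r\n", split(/\r?\n/, $headers)) . "\r\n"
         . "Content-Length: " . length($body) . "\r\n"
         . "\r\n"
         . $body;
}

# What a script printed goes in; what the browser receives comes out.
print package_response("Content-type: text/html\n\n<HTML><BODY>Hi</BODY></HTML>\n");
```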
Of course, that whole process doesn't always go as planned, and there are some things that can stand in the way of your program running correctly.
When the web server launches a sub-process to run the program (Step 5 above) it does a trick. It changes the User ID it is running as to a user that has very little or no permission to do anything on the web server. This is for security purposes, so you can't write a script that overwrites important files by accident, or deletes whole directory trees. But this also creates a problem when your program tries to access files to read and write.
If you want your program to write to a file, you need to make sure the file's permissions are set up correctly for this user (usually a user named 'nobody') to write to it. Once again, you need to use the 'chmod' command. A command like 'chmod 777 filename.txt' will give Read, Write, and Execute permissions for the file to anyone on the server machine, so even when the server changes to the new user it will still have access to the file.
File Permissions are an important thing to remember when trying to set up someone's CGI / PERL script, and are often the cause of it not working correctly. Make sure to follow the instructions on which file permissions are needed for which files in order to set up the script correctly.
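To show where those permissions come into play, here is a hedged sketch of a Perl sub that appends one form submission to a flat file. The sub name save_entry and the "name|email" record layout are assumptions made for this example, and the demo writes to a temporary file that stands in for a real data file such as data.txt. If the web server's user cannot write to the file, the open call fails and the sub returns the reason.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Append one "name|email" record to a flat file. The lock keeps two
# simultaneous form submissions from writing over each other.
sub save_entry {
    my ($file, $name, $email) = @_;
    # Strip newlines and the separator so each record stays on one line.
    for ($name, $email) { s/[\r\n|]/ /g }
    open(my $fh, '>>', $file) or return "cannot open $file: $!";
    flock($fh, LOCK_EX)       or return "cannot lock $file: $!";
    print {$fh} "$name|$email\n";
    close($fh)                or return "cannot close $file: $!";
    return '';    # empty string means success
}

# A temporary file stands in for a real data file such as data.txt.
my ($tmp_fh, $data_file) = tempfile(UNLINK => 1);
close $tmp_fh;

my $problem = save_entry($data_file, 'Jane Doe', 'jane@example.com');
print $problem ? "Failed: $problem\n" : "Saved.\n";
```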
The first thing a CGI / PERL script needs to output, assuming it's giving an HTML response back to the user, is "Content-type: text/html" followed by two returns (creating an empty line).
print "Content-type: text/html\n\n";
This is needed in any CGI / PERL script so that the web server knows what kind of data is being sent back to the browser and can handle it appropriately. The CGI / PERL script could actually return any type of response it wanted to - it could be plain text, or a PDF document, or a Microsoft Word file. But 99% of the time, the result of any CGI script is going to be plain HTML.
If the script runs and outputs anything else before the Content-type line, the web server will return an error message to the browser saying the script returned an invalid response.
Any time you have a script that doesn't work, you need to go through a series of steps to figure out what is wrong. Most of the time, you'll be setting up a script that someone else wrote, so sometimes it can be difficult to figure out what is wrong. But if you follow a few steps, debugging it should be easier.
1. Make sure CGI / PERL scripts are in the right place.
If you aren't the Webmaster in charge of the web server, the very first thing to check is that your scripts are in the right place. All your files need to be in your account www space folder, and scripts must be in your cgi-bin - no exceptions!
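2. Make sure the file is uploaded in the correct mode!
Executable scripts must be uploaded to the server as ASCII file type. If you upload a script file as Binary Type - It Will Not Work. The usual reason is line endings: a file edited on Windows ends each line with a carriage return plus a line feed, and a Binary upload keeps that extra carriage return, which breaks the shebang line on a Unix server. An ASCII upload converts the line endings for you.
3. Make sure the file is executable
See above. Make sure the file is executable! (Setting Chmod File Permissions on Unix / Linux)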
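If you suspect a script was uploaded in the wrong mode, you can check its contents for Windows line endings. The Perl sketch below shows one way to do that on a string; the sub names has_crlf and to_unix_newlines are invented for this example, and the sample text stands in for the contents of an uploaded file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True (1) if the text contains Windows-style CRLF line endings.
sub has_crlf {
    my ($text) = @_;
    return $text =~ /\r\n/ ? 1 : 0;
}

# Convert CRLF line endings to plain Unix LF line endings.
sub to_unix_newlines {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Sample text standing in for a script that was uploaded in Binary mode.
my $uploaded = "#!/usr/bin/perl\r\nprint \"Content-type: text/html\\n\\n\";\r\n";
print "CRLF found: ", has_crlf($uploaded), "\n";
print "After fix:  ", has_crlf( to_unix_newlines($uploaded) ), "\n";
```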
4. Check the shebang line
The first line of your script needs to be #!/usr/local/bin/perl or #!/usr/bin/perl, whichever matches where Perl is installed on your server.
5. Ask your Webmaster for help
If you've checked all these things, email your webmaster and see if they can help you out. They may have some special things set up, or they might be able to give you some clues. Also, be sure to tell them you've tried the above steps - they will love you! Nothing is more aggravating than receiving a request for help from someone who has apparently done NOTHING to try to help themselves.
6. Contact the script author for help
If all else fails, contact the script author for help. We recommend you use this as a last resort - most folks who have contributed free scripts to web sites explicitly state "no support provided". Some offer to install their scripts for a fee, and if you've chosen not to pay it's not cool to ask for their help for free!
Hopefully we've gone into enough detail in this tutorial to help you out. If you want to know more, or really get into the nitty-gritty of things, check out (Sample Scripts and Sources) and learn through trial and error on your own CGI Permissions enabled server account.
An important thing to remember is that this can be a complicated subject and you shouldn't expect easy answers. Programming and/or installing CGI / PERL Scripts is more difficult and involved than writing simple HTML. If you're trying to make the jump from one to the other, make sure you've got the desire to really learn it and the knowledge to make it happen.
Learning CGI / PERL Programming and making Perl Scripts work on your site can be a very satisfying experience. Hopefully this helped you along your way, and you'll have much success with it!