Debugging ePerl

Last time I reported my wish to have true 500 errors upon script errors. I reported this rewrite by MarginalHacks, but also that POST data was lost.

So, I digged into the code and here are my results.

Setting the current script

ePerl reads the script from the arguments. However, in CGI mode, the script to read from is set in the environment variable, not in argument.

What is weird is that sometimes, this is unnecessary as the filename is already captured in the $ENV{'PATH_TRANSLATED'} block just before the arguments parsing. Indeed, it started working just as I first published this post. Anyway, argument parsing is then ignored, which is a problem, and the PATH_TRANSLATED does not give the filename, but a path which is not the script if you have mod_rewrite or use PATH_INFO (as I do both, you can imagine how wrong it is). So:

@@ -161,13 +164,6 @@
   $opt{'perl'} = $^X;
   $opt{'CaseDelimiters'} = 1;
-    # We're being called in a CGI environment, so @ARGV contains
-    # the search keywords, not the files or options to process
-    @files = ($ENV{'PATH_TRANSLATED'});
-    # Check for "nph-"
-    $opt{'mode'} = basename($ENV{'PATH_TRANSLATED'}) =~ /^nph-/ ? "n" : "c";
-  } else {
     while ($#ARGV>=0) {
       my $arg=shift(@ARGV);
       if ($arg =~ /^-(h|-help)$/) { usage(\%opt); }
@@ -209,18 +206,17 @@
     # Mode if not specified
     $opt{'mode'} = "f" unless ($opt{'mode'});
     $opt{'mode'} = "f" if ($opt{'mode'} =~ /^filter$/i);
-    $opt{'mode'} = "c" if ($opt{'mode'} =~ /^cgi$/i);
-    $opt{'mode'} = "n" if ($opt{'mode'} =~ /^nph-cgi$/i);
+    $opt{'mode'} = "c" if ($opt{'mode'} =~ /^cgi$/i || $ENV{"SCRIPT_FILENAME"});
+    $opt{'mode'} = "n" if ($opt{'mode'} =~ /^nph-cgi$/i || $ENV{"SCRIPT_SRC_PATH_FILE"} =~ /^nph/i);
     # And check for it based on PROGNAME
     $opt{'mode'} = "n" if ($PROGNAME =~ /^nph-/i);
-  }
   usage(\%opt,"Unsupported mode: $opt{'mode'}") unless ($opt{'mode'} =~ /^[fcn]$/);
   if ($opt{'mode'} ne "f") {
     CGI::Carp->import('fatalsToBrowser');	# Output HTML for errors
     $opt{'convert-entity'} = 1;
     $opt{'preprocess'} = 1;
+    @files = $ENV{"SCRIPT_FILENAME"} if $opt{'mode'} ne "f";
       foreach my $file (@files) {
         usage(\%opt,"File `$file' is not allowed to be interpreted by ePerl (wrong extension!)",1)

Just remove that test! Try some other defaults as well, it seems to work, but should be more extensively tested.

Not loosing STDIN

Actually there is an option for this: --tmpfile. Which leads us to the second point:

Multiple shebang options

After adding --tmpfile to the shebang already containing --mode=CGI, I got a true 500 error from the server, and a usage report in the logs. It looks like arguments are not splitted on whitespace in the shebang. Seems to be a known bug (or even feature?). So we have to split the arguments directly in the script. Here is the diff:

@@ -169,6 +172,7 @@
     $opt{'mode'} = basename($ENV{'PATH_TRANSLATED'}) =~ /^nph-/ ? "n" : "c";
   } else {
+    @ARGV = split(" ", @ARGV[0]) if $#ARGV == 1; # weird behaviour in shebang: all args passed as only one, so split them here!
     while ($#ARGV>=0) {
       my $arg=shift(@ARGV);
       if ($arg =~ /^-(h|-help)$/) { usage(\%opt); }

Returning 500 code upon error

As mentionned previously, the script returns a 200 OK when an error happens in the script. This is annoying, because the page ends up indexed in Google (sample query. This can be a security issue, as the information disclosed could be used by an attacker to find a flaw in the code! (Actually it shouldn't create any flaw, it just discloses information to make it slightly easier to discover an existing flaw.) The HTTP spec defines the 500 Internal Server Error code to employ in this case of a server error. Hopefully it is an easy change:

@@ -1139,12 +1143,15 @@
   if ($opt_H->{'mode'} eq "n") {
     my $proto = $ENV{'SERVER_PROTOCOL'} || "HTTP/1.0";
-    print SEND_OUT "$proto 200 OK\n";
+    print SEND_OUT "$proto 500 Internal Server Error\n";
     my $server = $ENV{'SERVER_SOFTWARE'} || "unknown-server/0.0";
     print SEND_OUT "Server: $server ePerl/$VERSION Perl/$]\n";
     print SEND_OUT "Date: ".localtime(time)."\n";
     print SEND_OUT "Connection: close\n";
+  else {
+    print SEND_OUT "Status: 500\n";
+  }
 Content-Type: text/html

Note that the 500 code is now returned in both CGI and NPH-CGI modes.

Custom error page

The last step was to customize the error page. Right, the page provides useful information, but it is especially ugly!

I introduced a new --errorscript argument. The script in argument will be called and its output displayed in place of the built-in error message. It involves several changes:

@@ -129,6 +129,9 @@
                             (If script needs to read stdin, like a post .cgi)
   -1, --eval                Run in a single process using `eval'
                             (default for MSWin32 as can't fork)
+  -S, --errorscript=PATH    use a custom error script rather than the buit-in default
+                            message. Warning: do not use this argument in the error
+                            page itself, or you could end up with an infinite loop!
   --                        Following options are args to the ePerl script
@@ -196,6 +200,7 @@
       if ($arg =~ /^-(s|-strict)$/) { $opt{'strict'}=1; next; }
       if ($arg =~ /^-(t|-tmpfile)$/) { $opt{'tmpfile'}=1; next; }
       if ($arg =~ /^-(1|-eval)$/) { $opt{'eval'}=1; next; }
+      if ($arg =~ /^-(S|-errorscript=)(.+)?$/) { $opt{'errorscript'}=arg(); next; }
       if ($arg =~ /^-(r|-readme)$/) { readme(); exit(0); }
       if ($arg =~ /^-(l|-license)$/) { license(); exit(0); }
@@ -1145,12 +1190,23 @@
     print SEND_OUT "Date: ".localtime(time)."\n";
     print SEND_OUT "Connection: close\n";
+  else {
+    print SEND_OUT "Status: 500\n";
+  }
+  if ($opt_H->{'errorscript'}) {
+    $ENV{'REDIRECT_STATUS'} = 500;
+    $ENV{'REDIRECT_ERROR_NOTES'} = "<pre>$error</pre>";
+    $ENV{'REDIRECT_ERROR_NOTES'} = "<pre>$error: @err</pre>" if @err;
+    my $errorscript = $opt_H->{'errorscript'};
+    print SEND_OUT `$errorscript`;
+  }
+  else {
 Content-Type: text/html
@@ -1186,6 +1242,7 @@
+  }
 sub readme {

Warning: do not define --errorscript to point to the script itself. If something goes wrong, you would end up in an infinite loop, I haven't included code to check that condition!

Custom error page again

Ok, let's try what happens when we die in the ePerl script… well, we see the output stop at the point we died, but no error message is displayed! This seems actually linked to the way the --tmpfile argument is processed: error output is not captured. I added a dirty hack to do that, with another temp file to hold STDERR.

@@ -503,17 +509,24 @@
 # Write to a tmpfile, execute that
 sub start_perl_tmpfile {
   my ($opt_H) = @_;
   my $file = "$TMPDIR/$PROGNAME.$$";
   usage($opt_H,"Tmpfile already exists?? [$file]",1) if (-f $file);
+  my $errfile = "$TMPDIR/$PROGNAME.err.$$";
+  usage($opt_H,"Tmperrfile already exists?? [$errfile]",1) if (-f $errfile);
   my $save = umask 077;		# Some added safety
   $opt_H->{'ph'} = new IO::File;
+  $opt_H->{'pherr'} = new IO::File;
   usage($opt_H,"Couldn't create tmpfile [$file]",1)
     unless $opt_H->{'ph'}->open(">$file");
+  usage($opt_H,"Couldn't create tmperrfile [$errfile]",1)
+    unless $opt_H->{'pherr'}->open(">$errfile");
   $TMPFILE = $file;
+  $TMPERRFILE = $errfile;
   umask $save;
@@ -521,7 +534,10 @@
-sub clean_tmpfile { unlink $TMPFILE if $TMPFILE && -f $TMPFILE; }
+sub clean_tmpfile {
+  unlink $TMPFILE if $TMPFILE && -f $TMPFILE;
 sub interrupt { print STDERR "[$PROGNAME] **INTERRUPT**"; clean_tmpfile(); exit; }
 # Just open a normal pipe to a perl process, redirect STDOUT
@@ -797,9 +813,38 @@
     # Dangerous race condition here!
     usage($opt_H,"Tmpfile disappeared?? [$TMPFILE]",1)
       unless $TMPFILE && -r $TMPFILE;
-    system("$opt_H->{'perl'} $opt_H->{'perl_opts'} $TMPFILE @ARGV");
+    my $output = `$opt_H->{'perl'} $opt_H->{'perl_opts'} $TMPFILE @ARGV  2>$TMPERRFILE`;
     $ret = $?;
+    $opt_H->{'pherr'}->close;
+    my $exit = $ret >> 8;
+    my $int  = $ret & 127;
+    my $core = $ret & 128;
+    $exit|=0xffffff00 if $exit>>7;
+    $exit = sprintf("%d",$exit);
+	if ($exit || $int || $core) { # Ok, there was an error!
+		# read-open the error file
+		my $errfile = "$TMPDIR/$PROGNAME.err.$$";
+			usage($opt_H,"Tmperrfile already removed?? [$errfile]",1) unless (-f $errfile);
+		$opt_H->{'pherr'} = new IO::File;
+		$opt_H->{'pherr'}->open("<$errfile");
+		my $error = "";
+		$error .= "[$PROGNAME] Interpretor returned error [$exit]\n" if ($exit);
+		$error .= "[$PROGNAME] **INTERRUPT**\n" if $int;
+		$error .= "[$PROGNAME] (Core dump)\n" if $core;
+		$error .= "$opt_H->{'start_file'} syntax OK\n" if ($opt_H->{'syntax_check'} && !$ret);
+		if ($error && $opt_H->{'mode'} eq "f") {
+			print STDERR $error;
+		} elsif ($error) {
+			redirect_output($opt_H);
+			chomp $error;
+			html_error($opt_H,$error,$opt_H->{'pherr'}->getlines);
+			$opt_H->{'pherr'}->close;
+		}
+	} else {
+		print $output;
+	}
+    exit($exit); # Exit here directly
   } elsif ($opt_H->{'eval'}) {
     # eval method


This new ePerl works (it's already used to generate the pages you see!) I think I've fixed the main issues I had, but I'm not fully satisfied with it. There is too much useless code (for my needs) and dirty hacks. I might come with an ePerlLite in some future, but for now it will do.

You can download the diff if you're interrested. Just patch the original with patch eperl.diff and that's it, or get the script directly (this one is additionally converted to UTF-8).

Xavier Robin
Publié le samedi 19 juin 2010 à 15:56 CEST
Lien permanent : /blog/2010/06/19/debugging-eperl
Tags : Programmation
Commentaires : 1


Par term papers le vendredi 20 août 2010 à 16:28 CEST

I like the idea of looking at your categories its interesting and most about that – it’s good inspiration. I have been torn between the direction of what my newest site should be focused on and after reading this.thank you very much for your information provided. but I am still a little bit impressed

Nouveau commentaire

* L'astérisque dénote un champ obligatoire.

En soumettant votre message, vous acceptez qu' il soit publié sous licence CC BY-SA 3.0.

Quelques balises HTML sont autorisées : a[href, hreflang, title], br, em, i, strong, b, tt, samp, kbd, var, abbr[title], acronym[title], code, q[cite], sub, sup.

Switch to English



Bruit de fond Hobbys Humour Informatique Internet Livres Logiciels Moi Mon site web Mozilla Photo Politique Programmation Scolaire Ubuntu pROC

Billets récents