istence of well-known web directories. They may allow the tester
to download the web site structure, which is helpful when trying to
determine the configuration of web directories and how individual file
extensions are served. Other tools that can be used for this purpose
include:
• wget - http://www.gnu.org/software/wget
• curl - http://curl.haxx.se
• Search Google for “web mirroring tools”.
Review Old, Backup and Unreferenced Files for
Sensitive Information (OTG-CONFIG-004)
Summary
While most of the files within a web server are directly handled by the
server itself, it isn’t uncommon to find unreferenced or forgotten files
that can be used to obtain important information about the infrastructure
or credentials.
The most common scenarios include renamed old versions of modified
files, include files that are loaded by the application's language
of choice and can be downloaded as source, or even automatic or
manual backups in the form of compressed archives. Backup files can also
be generated automatically by the underlying file system the application
is hosted on, a feature usually referred to as “snapshots”.
All these files may grant the tester access to inner workings, back
doors, administrative interfaces, or even credentials to connect to the
administrative interface or the database server.
An important source of vulnerability lies in files which have nothing to
do with the application, but are created as a consequence of editing
application files, or after creating on-the-fly backup copies, or by leaving
old or unreferenced files in the web tree. Performing in-place
editing or other administrative actions on production web servers may
inadvertently leave backup copies, either generated automatically by
the editor while editing files, or by the administrator who is zipping a
set of files to create a backup.
It is easy to forget such files and this may pose a serious security
threat to the application. That happens because backup copies may be
generated with file extensions differing from those of the original files.
A .tar, .zip or .gz archive that we generate (and forget...) obviously has
a different extension, and the same happens with automatic copies
created by many editors (for example, emacs generates a backup copy
named file~ when editing file). Making a copy by hand may produce the
same effect (think of copying file to file.old). The underlying file system
the application is on could be making “snapshots” of your application
at different points in time without your knowledge, which may also be
accessible via the web, posing a similar but different “backup file” style
threat to your application.
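The naming patterns described above can be turned into a list of candidate file names to request during testing. The following is a minimal sketch, assuming the tester already knows the path of a live file. The function name backup_candidates and the suffix lists are illustrative choices based on the examples in the text (file~, file.old, and .tar, .zip or .gz archives), not an exhaustive catalogue.

```python
from pathlib import PurePosixPath
from typing import List

# Suffixes taken from the examples in the text; real editors and
# administrators use many more (this list is an assumption).
APPENDED_SUFFIXES = ["~", ".old"]
ARCHIVE_EXTENSIONS = [".tar", ".zip", ".gz"]


def backup_candidates(path: str) -> List[str]:
    """Return plausible backup names for a known path such as '/app/login.asp'."""
    known = PurePosixPath(path)
    # Copies made by editors or by hand: login.asp~ and login.asp.old
    candidates = [f"{path}{suffix}" for suffix in APPENDED_SUFFIXES]
    for ext in ARCHIVE_EXTENSIONS:
        # Archive named after the full file name: login.asp.zip
        candidates.append(f"{path}{ext}")
        # Archive named after the file stem: login.zip
        candidates.append(str(known.with_suffix(ext)))
    return candidates


if __name__ == "__main__":
    for name in backup_candidates("/app/login.asp"):
        print(name)
```

Each candidate would then be requested from the target server, and any response other than "not found" examined by hand.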
As a result, these activities generate files that are not needed by the
application and may be handled differently than the original file by
the web server. For example, if we make a copy of login.asp named
login.asp.old, we are allowing users to download the source code of
login.asp. This is because login.asp.old will typically be served as
plain text, rather than being executed, because of its extension. In other
words, accessing login.asp causes the execution of the server-side
code of login.asp, while accessing login.asp.old causes the content of
login.asp.old (which is, again, server-side code) to be plainly returned
to the user and displayed in the browser. This may pose security risks,
since sensitive information may be revealed.
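A tester can apply a rough check to a response to decide whether a file such as login.asp.old was returned as source rather than executed. The sketch below is only a heuristic under stated assumptions: it treats a response as suspicious when its media type is text/plain or application/octet-stream and the body still contains unprocessed server-side delimiters. The function name, the media types and the marker list are hypothetical choices, not part of any standard tool.

```python
import re

# Delimiters that open server-side code in ASP/JSP ("<%") and PHP ("<?php").
# An illustrative assumption; other languages use different markers.
SOURCE_MARKERS = re.compile(r"<%|<\?php", re.IGNORECASE)

# Media types a server might use for a file it does not execute (assumption).
NON_EXECUTED_TYPES = {"text/plain", "application/octet-stream"}


def looks_like_source_disclosure(content_type: str, body: str) -> bool:
    """Return True if the response appears to contain raw server-side code."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in NON_EXECUTED_TYPES and bool(SOURCE_MARKERS.search(body))
```

A positive result only indicates that manual review is worthwhile; the heuristic can miss disclosures served with other media types.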
Generally, exposing server side code is a bad idea. Not only are you
unnecessarily exposing business logic, but you may be unknowingly
revealing application-related information which may help an attacker
(path names, data structures, etc.). Moreover, many scripts embed
usernames and passwords in clear text (which is a careless and very
dangerous practice).
Other unreferenced files result from design or configuration
choices that allow various kinds of application-related files, such
as data files, configuration files and log files, to be stored in file system
directories that can be accessed by the web server. These files have
normally no reason to be in a file system space that could be accessed
via web, since they should be accessed only at the application level,
by the application itself (and not by the casual user browsing around).
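When the tester (or an administrator) has file system access to the server, the web root can be searched for such files directly. The following sketch assumes a local directory path and a hypothetical set of extensions for data, configuration and log files; both the function names and the extension set are illustrative and should be adapted to the application under review.

```python
import os
from typing import Iterator

# Extensions that usually indicate data, configuration or log files.
# This set is an assumption chosen for illustration.
RISKY_EXTENSIONS = {".log", ".conf", ".ini", ".cfg", ".sql", ".csv"}


def is_risky(filename: str) -> bool:
    """Return True if the file extension suggests a data, config or log file."""
    return os.path.splitext(filename)[1].lower() in RISKY_EXTENSIONS


def find_risky_files(web_root: str) -> Iterator[str]:
    """Yield paths under web_root whose extension is in RISKY_EXTENSIONS."""
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in filenames:
            if is_risky(name):
                yield os.path.join(dirpath, name)
```

Any file reported this way is a candidate for relocation outside the directories served by the web server.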
Threats
Old, backup and unreferenced files present various threats to the
security of a web application:
• Unreferenced files may disclose sensitive information that can
facilitate a focused attack against the application; for example include
files containing database credentials, configuration files containing
references to other hidden content, absolute file paths, etc.
• Unreferenced pages may contain powerful functionality that can be
used to attack the application; for example an administration page
that is not linked from published content but can be accessed by any
user who knows where to find it.
• Old and backup files may contain vulnerabilities that have been fixed
in more recent versions; for example viewdoc.old.jsp may contain a
directory traversal vulnerability that has been fixed in viewdoc.jsp
but can still be exploited by anyone who finds the old version.
• Backup files may disclose the source code for pages designed to
execute on the server; for example requesting viewdoc.bak may
return the source code for viewdoc.jsp, which can be reviewed for
vulnerabilities that may be difficult to find by making blind requests
to the executable page. While this threat obviously applies to scripted
languages, such as Perl, PHP, ASP, shell scripts, JSP, etc., it is not
limited to them, as shown in the example provided in the next bullet.
• Backup archives may contain copies of all files within (or even
outside) the webroot. This allows an attacker to quickly enumerate
the entire application, including unreferenced pages, source code,
include files, etc. For example, if you forget a file named
myservlets.jar.old containing (a backup copy of) your servlet implementation
classes, you are exposing a lot of sensitive information which is
susceptible to decompilation and reverse engineering.
• In some cases copying or editing a file does not modify the file
extension, but modifies the file name. This happens for example in
Windows environments, where file copying operations generate file
names prefixed with “Copy of ” or localized versions of this string.
Since the file extension is left unchanged, this is not a case where
an executable file is returned as plain text by the web server, and
therefore not a case of source code disclosure. However, these
files too are dangerous because there is a chance that they include
obsolete and incorrect logic that, when invoked, could trigger
application errors, which might yield valuable information to an
attacker, if diagnostic message display is enabled.
• Log files may contain sensitive information about the activities
of application users, for example sensitive data passed in URL
parameters, session IDs, URLs visited (which may disclose additional