Received: (qmail 20478 invoked by uid 2012); 24 Nov 1998 10:49:09 -0000 Message-Id: <19981124104909.20477.qmail@hyperreal.org> Date: 24 Nov 1998 10:49:09 -0000 From: Rainer Scherg Reply-To: Rainer.Scherg@rexroth.de To: apbugs@hyperreal.org Subject: RefererIgnore in CustomLog X-Send-Pr-Version: 3.2 >Number: 3449 >Category: mod_log-any >Synopsis: RefererIgnore in CustomLog >Confidential: no >Severity: non-critical >Priority: medium >Responsible: apache >State: closed >Class: change-request >Submitter-Id: apache >Arrival-Date: Tue Nov 24 02:50:01 PST 1998 >Last-Modified: Thu Feb 18 04:14:04 PST 1999 >Originator: Rainer.Scherg@rexroth.de >Organization: >Release: 1.3.3 >Environment: Solaris 2.5, gcc >Description: Hi, to minimize the impact of referer information to the size of log-files, it would be nice to have a RefererIgnore for the CustomLogFiles. Logging Referer information of my own server often is not very exciting... Another idea: RefererReplace "local" www.xyz.de abc.xyz.de RefererReplace "mypartner" domain.somewhere [some sort of that, which means: any request from the above URLs/hosts/domains should be logged as ARG1 (URL-like). Advantage: A statistic of a log can show how many e.g. local requests without splitting them up into each single URL] BTW: IMO it would make sense to combine mod_log_referer and mod_log_config. tnx for listening -- Rainer >How-To-Repeat: >Fix: >Audit-Trail: State-Changed-From-To: open-suspended State-Changed-By: coar State-Changed-When: Tue Nov 24 03:03:15 PST 1998 State-Changed-Why: This is an issue that keeps coming up, but I think this is the first actual PR on it. I'm suspending this for future reference. Some of the developers would like to put conditional logging into mod_log_config so that the mod_log_referer and mod_log_agent modules could go away, but others don't want that in the core since the same functionality can be had by using piped logs with a filter process between the httpd daemon and the log file. The issue keeps gettinf raised, but neither side has entirely given up yet. :-) Release-Changed-From-To: 1.3.x-1.3.3 Release-Changed-By: coar Release-Changed-When: Tue Nov 24 03:03:15 PST 1998 From: "Rainer Scherg," To: "'apbugs@hyperreal.org'" , "'apache-bugdb@apache.org'" Cc: Subject: RE: mod_log-any/3449: RefererIgnore in CustomLog Date: Wed, 25 Nov 1998 09:40:27 +0100 Hi Ken, [sorry for the silly kind of quoting, but that's outlook] I'm nearly finished getting rid of the modules log_referer and log_agent. Therefor I will send you the coding I've done. But I will include some #ifdef code to map out the enhancements. Of course, you can have a lot of discussion to define a core functionality for a server - but IMHO what matters is to get a server to top performance. Using external progs/scripts do to the funtionality always as performance penalty (and our server already have to much load). [OK depends on: I would compile Oracle and PHP into apache, because the apache prog would become to large - that takes to much time in spawning processes.] In my experience, many developers are always thinking of ideal systems and not of what is really needed. E.g.: Right now we have the standard discussion whether to use apache (Solaris) or MIIS on NT - and the only way to get rid of the discussion is to show that apache is more powerfull and easier to handle for standard users (providing HTML stuff) than MIIS - and this means real world funtionality to apache. (example: PR#3430) -- Rainer BTW: Funny From-address 8-/ [or a problem with outlook] -----Original Message----- From: Rodent of Unusual Size Sent: Tuesday, November 24, 1998 7:26 PM To: Rainer Scherg, Subject: Re: mod_log-any/3449: RefererIgnore in CustomLog Rainer Scherg, wrote: > > Right now I'm doing some code patches to mod_log_config. > I'll forward you the results ASAP. > But some code changes will still to be done > (e.g. bind RefererIgnore to a LogFileFormat, etc.). The thing keeping it from happening in the core distribution isn't a technical issue but a philosophical one. We've had various versions that did the work before, but they were voted down. > Is coar@apache.org a personal mailbox (due to mail file > attachments). Yes, that's one of my personal mailboxes. -- #ken P-)} Ken Coar Apache Group member "Apache Server for Dummies" From: "Rainer Scherg," To: "'coar@apache.org'" Cc: "'apbugs@apache.org'" Subject: mod_log-any/3449: RefererIgnore in CustomLog Date: Thu, 26 Nov 1998 15:39:03 +0100 Hi! As requested by Ken, here comes the diff-file... The enhancements are roughly tested (running on our intranet since yesterday - 'til now it's working fine). Work done: - AgentLog fn (compatability with mod_log_agent) - RefererLog fn (compatability with mod_log_referer) - RefererIgnore str str ... ( - " - , slightly behavior) - RefererReplace str repl (new function at no add. cost) - Predefined log format ELF (please have a look on def.!) Still to be done: - Doc (HTML) >> diff -u mod_log_config.c.cvs-tree mod_log_config.new.c Please send feedback if question or problem. === Rainer === snip ======= --- mod_log_config.c.cvs-tree Thu Nov 26 14:34:11 1998 +++ mod_log_config.c Thu Nov 26 14:32:40 1998 @@ -75,6 +75,27 @@ * CookieLog fn For backwards compatability with old Cookie * logging module - now deprecated. * + *--- Added 1998-11-24: + * RefererLog fn For backwards compat. with old log_referer module + * RefererIgnore string1 string2 ... + * Avoid logging of referer information, + * if string is found in referer. + * (same as RefererReplace string - ) + * Remark: RefererLog logs "-" instead of nothing!! + * RefererReplace string repl + * Replace referer information with repl, if + * string is found in referer. + * AgentLog fn For backwards compat. with log_agent module + * + * + * RefererReplace could be used to abstract referer information + * from e.g. a specific host, if a detailed URL information isn't + * needed. E.g. mark referer information of your own server as "Local": + * RefererReplace www.mydomain.com "Local" + * RefererReplace www.mypartner.de "Linked_From_My_Partner" + * This could minimize the size of logfiles and abstract information + * for e.g. statistic tools. + *--- * There can be any number of TransferLog and CustomLog * commands. Each request will be logged to _ALL_ the * named files, in the appropriate format. @@ -106,7 +127,7 @@ * CustomLog logs/referer "%{referer}i -> %U" * CustomLog logs/agent "%{user-agent}i" * - * Except: no RefererIgnore functionality + * Except: * logs '-' if no Referer or User-Agent instead of nothing * * But using this method allows much easier modification of the @@ -163,9 +184,31 @@ * server. If it doesn't have its own TransferLog, it writes to the * same descriptor (meaning the same process for "| ..."). * - * --- rst */ + * --- rst + * + * 1998-11-24 Rainer.Scherg@t-online.de + * - merge mod_log_referer/mod_log_agent -> mod_log_config.c + * - AgentLog directive (for backwards compat.) + * - RefererLog directive (for backwards compat.) + * - RefererIgnore directive + * - RefererReplace directive (new) + * - Add predefined format nickname ELF + * ToDo: Bind RefererIgnore/Replace to format nickname + * (necessary?) + */ + #define DEFAULT_LOG_FORMAT "%h %l %u %t \"%r\" %>s %b" +/* $$$ see: http://www.netstore.de/Supply/http-analyze/manual/index.html + $$$ abrevations of logfilenames used by some statistic tools... + $$$ Standard predefined names makes life easy for non-expierienced users +*/ +#define COMBINED_LOG_FORMAT "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\"" +#define EXTENDED_LOG_FORMAT "%h %l %u %t \"%r\" %s %b \"%{User-agent}i\" \"%{Referer}i\"" +#define REFERER_LOG_FORMAT "%{referer}i -> %U" +#define AGENT_LOG_FORMAT "%{user-agent}i" +#define COOKIE_LOG_FORMAT "%{Cookie}n \"%r\" %t" + #include "httpd.h" #include "http_config.h" @@ -173,8 +216,16 @@ #include "http_log.h" #include + module MODULE_VAR_EXPORT config_log_module; + +static const char *check_referer_replace(const char *referer, + array_header *referer_replace_list, request_rec *orig); + + + + static int xfer_flags = (O_WRONLY | O_APPEND | O_CREAT); #if defined(OS2) || defined(WIN32) /* OS/2 dosen't support users and groups */ @@ -213,14 +264,24 @@ * which might be empty. */ + +typedef struct { + char *str; /* search for str */ + char *repl; /* replace with */ +} replace_entry; + typedef struct { char *default_format_string; array_header *default_format; array_header *config_logs; array_header *server_config_logs; table *formats; + array_header *referer_replace_list; } multi_log_state; + + + /* * config_log_state holds the status of a single log file. fname might * be NULL, which means this module does no logging for this @@ -234,6 +295,7 @@ char *fname; char *format_string; array_header *format; + array_header *referer_replace_list; int log_fd; #ifdef BUFFERED_LOGS int outcnt; @@ -256,6 +318,7 @@ array_header *conditions; } log_format_item; + static char *format_integer(pool *p, int i) { return ap_psprintf(p, "%d", i); @@ -364,6 +427,7 @@ { return ap_table_get(r->notes, a); } + static const char *log_env_var(request_rec *r, char *a) { return ap_table_get(r->subprocess_env, a); @@ -637,7 +701,7 @@ */ static const char *process_item(request_rec *r, request_rec *orig, - log_format_item *item) + log_format_item *item, config_log_state *cls) { const char *cp; @@ -664,6 +728,17 @@ /* We do. Do it... */ cp = (*item->func) (item->want_orig ? orig : r, item->arg); + + /* Referer handling + * - RefererReplace + */ + + if (cp && (*item->func == log_header_in) + && item->arg && !strcasecmp (item->arg,"referer")) { + + cp = check_referer_replace (cp,cls->referer_replace_list,orig); + } + return cp ? cp : "-"; } @@ -708,7 +783,7 @@ } for (i = 0; i < format->nelts; ++i) { - strs[i] = process_item(r, orig, &items[i]); + strs[i] = process_item(r, orig, &items[i], cls); } for (i = 0; i < format->nelts; ++i) { @@ -785,11 +860,13 @@ multi_log_state *mls = (multi_log_state *) ap_palloc(p, sizeof(multi_log_state)); mls->config_logs = ap_make_array(p, 1, sizeof(config_log_state)); + mls->referer_replace_list = ap_make_array(p, 4, sizeof(replace_entry)); mls->default_format_string = NULL; mls->default_format = NULL; mls->server_config_logs = NULL; mls->formats = ap_make_table(p, 4); ap_table_setn(mls->formats, "CLF", DEFAULT_LOG_FORMAT); + ap_table_setn(mls->formats, "ELF", EXTENDED_LOG_FORMAT); return mls; } @@ -812,6 +889,10 @@ } add->formats = ap_overlay_tables(p, base->formats, add->formats); + /* $$$ Something to Do for RefererReplace? + $$$ Inherit from master server? + */ + return add; } @@ -861,7 +942,7 @@ cls->format = parse_log_string(cmd->pool, fmt, &err_string); } cls->log_fd = -1; - + cls->referer_replace_list = mls->referer_replace_list; return err_string; } @@ -872,9 +953,85 @@ static const char *set_cookie_log(cmd_parms *cmd, void *dummy, char *fn) { - return add_custom_log(cmd, dummy, fn, "%{Cookie}n \"%r\" %t"); + return add_custom_log(cmd, dummy, fn, COOKIE_LOG_FORMAT); } + +/* Backward compatibility (mod_log_referer.c, mod_log_agent.c) + * - RefererLog file + * Log referer information "uri -> document" to file + * - RefererIgnore host [host] ... + * Exclude RefererInfo from Log + * - RefererReplace string replacestring + * Replace RefererInfo + * - AgentLog file + */ + +static const char *set_referer_log(cmd_parms *cmd, void *dummy, char *fn) +{ + return add_custom_log(cmd, dummy, fn, REFERER_LOG_FORMAT); +} + + +static const char *set_agent_log(cmd_parms *cmd, void *dummy, char *fn) +{ + return add_custom_log(cmd, dummy, fn, AGENT_LOG_FORMAT); +} + + + +static const char *add_referer_replace(cmd_parms *cmd, void *dummy, char *str, + char *repl) +{ + replace_entry *new_re; + multi_log_state *mls = ap_get_module_config(cmd->server->module_config, + &config_log_module); + + /* RefererReplace string replacestring */ + + if (str && repl) { + ap_str_tolower(str); + new_re = ap_push_array(mls->referer_replace_list); + + new_re->str = str; + new_re->repl = repl; + } + return NULL; +} + + +static const char *add_referer_ignore(cmd_parms *parms, void *dummy, char *arg) +{ + return add_referer_replace (parms,dummy,arg,"-"); +} + + +/* + * Check if referer information should be replaced with repl-string. + */ + +static const char *check_referer_replace(const char *referer, + array_header *r_list, request_rec *orig) + +{ + replace_entry *r_entry,*rp; + char *referertest; + int i; + + + referertest = ap_pstrdup(orig->pool, referer); + ap_str_tolower(referertest); + + r_entry = (replace_entry *) r_list->elts; + for (i=0; i < r_list->nelts; i++) { + rp = &r_entry[i]; + if (strstr(referertest, rp->str)) return (const char *)rp->repl; + } + + return referer; +} + + static const command_rec config_log_cmds[] = { {"CustomLog", add_custom_log, NULL, RSRC_CONF, TAKE2, @@ -885,8 +1042,17 @@ "a log format string (see docs) and an optional format name"}, {"CookieLog", set_cookie_log, NULL, RSRC_CONF, TAKE1, "the filename of the cookie log"}, + {"RefererLog", set_referer_log, NULL, RSRC_CONF, TAKE1, + "the filename of the referer log"}, + {"RefererIgnore", add_referer_ignore, NULL, RSRC_CONF, ITERATE, + "referer hostnames to ignore"}, + {"RefererReplace", add_referer_replace, NULL, RSRC_CONF, TAKE2, + "referer hostnames to ignore"}, + {"AgentLog", set_agent_log, NULL, RSRC_CONF, TAKE1, + "the filename of the agent log"}, {NULL} }; + static config_log_state *open_config_log(server_rec *s, pool *p, config_log_state *cls, State-Changed-From-To: suspended-closed State-Changed-By: coar State-Changed-When: Thu Feb 18 04:14:04 PST 1999 State-Changed-Why: Conditional logging has been added for the next release after 1.3.4. Thanks for your patience, and for using Apache! >Unformatted: [In order for any reply to be added to the PR database, ] [you need to include in the Cc line ] [and leave the subject line UNCHANGED. This is not done] [automatically because of the potential for mail loops. ] [If you do not include this Cc, your reply may be ig- ] [nored unless you are responding to an explicit request ] [from a developer. ] [Reply only with text; DO NOT SEND ATTACHMENTS! ]