From: Junio C Hamano <gitster@pobox.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Git List <git@vger.kernel.org>,
Jonathan Nieder <jrnieder@gmail.com>,
David Barr <david.barr@cordelta.com>,
Sverre Rabbelier <srabbelier@gmail.com>
Subject: Re: [PATCH 2/5] vcs-svn: Start working on the dumpfile producer
Date: Fri, 21 Jan 2011 16:30:18 -0800 [thread overview]
Message-ID: <7v39olok4l.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: 1295415899-1192-3-git-send-email-artagnon@gmail.com
Ramkumar Ramachandra <artagnon@gmail.com> writes:
> Start off with some broad design sketches. Compile succeeds, but
> parser is incorrect. Include a Makefile rule to build it into
> vcs-svn/lib.a.
>
> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
I initially thought I should refrain from doing a usual picky review for
contrib/ material, but realized this does not go to contrib/, so...
> diff --git a/vcs-svn/dump_export.c b/vcs-svn/dump_export.c
> new file mode 100644
> index 0000000..04ede06
> --- /dev/null
> +++ b/vcs-svn/dump_export.c
> @@ -0,0 +1,73 @@
> +/*
> + * Licensed under a two-clause BSD-style license.
> + * See LICENSE for details.
> + */
> +
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +#include "line_buffer.h"
> +#include "dump_export.h"
> +
> +void dump_export_begin_rev(int revision, const char *revprops,
> + int prop_len) {
Style. "{" at the beginning of the next line (same everywhere).
> +void dump_export_node(const char *path, enum node_kind kind,
> + enum node_action action, unsigned long text_len,
> + unsigned long copyfrom_rev, const char *copyfrom_path) {
> + printf("Node-path: %s\n", path);
> + printf("Node-kind: ");
> + switch (action) {
> + case NODE_KIND_NORMAL:
> + printf("file\n");
> + break;
> + case NODE_KIND_EXECUTABLE:
> + printf("file\n");
> + break;
> + case NODE_KIND_SYMLINK:
> + printf("file\n");
> + break;
> + case NODE_KIND_GITLINK:
> + printf("file\n");
> + break;
> + case NODE_KIND_SUBDIR:
> + die("Unsupported: subdirectory");
> + default:
> + break;
> + }
Hmph, is it expected that the repertoire of node-kind stay the same, or
will it be extended over time? It might make sense to change the above
switch a more table-driven one. The same for node-action that follows
this part of the code.
> diff --git a/vcs-svn/svnload.c b/vcs-svn/svnload.c
> new file mode 100644
> index 0000000..7043ae7
> --- /dev/null
> +++ b/vcs-svn/svnload.c
> @@ -0,0 +1,294 @@
> ...
> +#define SVN_DATE_FORMAT "%Y-%m-%dT%H:%M:%S.000000Z"
> +#define SVN_DATE_LEN 28
> +#define LENGTH_UNKNOWN (~0)
> +
> +static struct line_buffer input = LINE_BUFFER_INIT;
> +static struct strbuf blobs[100];
> +
Is there a rationale for "100", or is it just a random number that can be
tweakable at the source level? Would we want to have a symbolic constant
for it?
The "blank" line has a trailing whitespace.
> +static struct {
> + unsigned long prop_len, text_len, copyfrom_rev, mark;
> + int text_delta, prop_delta; /* Boolean */
> + enum node_action action;
> + enum node_kind kind;
> + struct strbuf copyfrom_path, path;
> +} node_ctx;
> + ...
> +static void reset_node_ctx(void)
> +{
> + node_ctx.prop_len = LENGTH_UNKNOWN;
> + node_ctx.text_len = LENGTH_UNKNOWN;
> + node_ctx.mark = 0;
> + node_ctx.copyfrom_rev = 0;
> + node_ctx.text_delta = -1;
> + node_ctx.prop_delta = -1;
Does your "Boolean" type take values 0 or -1? Or is it perhaps a tristate
false=0, true=1, unknown=-1? If so, the comment on the type above needs
to be fixed.
> + strbuf_reset(&node_ctx.copyfrom_path);
> + strbuf_reset(&node_ctx.path);
> +}
> +
> +static void populate_props(struct strbuf *props, const char *author,
> + const char *log, const char *date) {
Style on "{" (look everywhere).
> + strbuf_reset(props);
Trailing whitespace.
> +static void parse_author_line(char *val, struct strbuf *name,
> + struct strbuf *email, struct strbuf *date) {
> + char *t, *tz_off;
> + char time_buf[SVN_DATE_LEN];
> + const struct tm *tm_time;
> +
> + /* Simon Hausmann <shausman@trolltech.com> 1170199019 +0100 */
> + strbuf_reset(name);
> + strbuf_reset(email);
> + strbuf_reset(date);
> + tz_off = strrchr(val, ' ');
> + *tz_off++ = '\0';
> + t = strrchr(val, ' ');
> + *(t - 1) = '\0'; /* Ignore '>' from email */
> + t ++;
Unwanted SP before "++" (look everywhere).
> + tm_time = time_to_tm(strtoul(t, NULL, 10), atoi(tz_off));
Has your caller already verified tz_off is a reasonable integer?
> +void svnload_read(void) {
> ...
> + switch (val - t - 1) {
> + case 1:
> + if (!memcmp(t, "D", 1)) {
> + node_ctx.action = NODE_ACTION_DELETE;
> + }
> + else if (!memcmp(t, "C", 1)) {
Style: "} else if (...) {".
> + node_ctx.action = NODE_ACTION_ADD;
> + }
> + else if (!memcmp(t, "R", 1)) {
> + node_ctx.action = NODE_ACTION_REPLACE;
> + }
> + else if (!memcmp(t, "M", 1)) {
> + node_ctx.action = NODE_ACTION_CHANGE;
> + mode_incr = 7;
> + if (!memcmp(val, "100644", 6))
> + node_ctx.kind = NODE_KIND_NORMAL;
> + else if (!memcmp(val, "100755", 6))
> + node_ctx.kind = NODE_KIND_EXECUTABLE;
> + else if (!memcmp(val, "120000", 6))
> + node_ctx.kind = NODE_KIND_SYMLINK;
> + else if (!memcmp(val, "160000", 6))
> + node_ctx.kind = NODE_KIND_GITLINK;
> + else if (!memcmp(val, "040000", 6))
Is <mode> guaranteed to be a 6-digit octal, padded with 0 to the left?
The manual page of git-fast-import seems to hint that is the case, but I
am double checking, as the leading zero is an error in the tree object.
> + node_ctx.kind = NODE_KIND_SUBDIR;
> + else {
> + if (!memcmp(val, "755", 3))
> + node_ctx.kind = NODE_KIND_EXECUTABLE;
> + else if(!memcmp(val, "644", 3))
Style; missing SP after "if/switch" (look everywhere).
> + node_ctx.kind = NODE_KIND_NORMAL;
> + else
> + die("Unrecognized mode: %s", val);
> + mode_incr = 4;
> + }
> + val += mode_incr;
Hmm, this whole thing is ugly.
Perhaps a table-lookup, or at least a helper function that translates
"val" to node-kind (while advancing val as the side effect) may make it
easier to read?
> + t = strchr(val, ' ');
> + *t++ = '\0';
> + strbuf_reset(&node_ctx.path);
> + strbuf_add(&node_ctx.path, t, strlen(t));
> + if (!memcmp(val + 1, "inline", 6))
> + die("Unsupported dataref: inline");
> + else if (*val == ':')
> + to_dump = &blobs[strtoul(val + 1, NULL, 10)];
Has anybody so far verified the decimal number at (val+1) is a reasonable
value, or am I seeing a future CVE here?
Do we care what follows the len of digits on this line? Does a line in G-F-I
stream allow trailing garbage (this question is not limited to this part
of the code)?
> +int svnload_init(const char *filename)
> +{
> + int i;
> + if (buffer_init(&input, filename))
> + return error("cannot open %s: %s", filename, strerror(errno));
> + active_ctx = UNKNOWN_CTX;
> + strbuf_init(&rev_ctx.props, MAX_GITSVN_LINE_LEN);
> + ...
> + strbuf_init(&node_ctx.copyfrom_path, MAX_GITSVN_LINE_LEN);
> + for (i = 0; i < 100; i ++)
> + strbuf_init(&blobs[i], 10000);
> + return 0;
> +}
> +
> +void svnload_deinit(void)
> +{
> + int i;
> + reset_rev_ctx(0);
> + reset_node_ctx();
> + strbuf_release(&rev_ctx.props);
> + ...
> + strbuf_release(&node_ctx.copyfrom_path);
> + for (i = 0; i < 100; i ++)
> + strbuf_release(&blobs[i]);
> + if (buffer_deinit(&input))
> + fprintf(stderr, "Input error\n");
> + if (ferror(stdout))
> + fprintf(stderr, "Output error\n");
> +}
Whoever needs to add an element to rev_ctx must consistently touch reset,
init and deinit. Would it help him/her if you moved the code so that
these functions are close together from the beginning?
next prev parent reply other threads:[~2011-01-22 0:30 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-01-19 5:44 [RFC PATCH v2 0/5] Towards a Git-to-SVN bridge Ramkumar Ramachandra
2011-01-19 5:44 ` [PATCH 1/5] date: Expose the time_to_tm function Ramkumar Ramachandra
2011-01-19 5:44 ` [PATCH 2/5] vcs-svn: Start working on the dumpfile producer Ramkumar Ramachandra
2011-01-22 0:30 ` Junio C Hamano [this message]
2011-01-22 9:45 ` Ramkumar Ramachandra
2011-01-19 5:44 ` [PATCH 3/5] Build an svn-fi target in contrib/svn-fe Ramkumar Ramachandra
2011-01-19 5:44 ` [PATCH 4/5] fast-export: Introduce --inline-blobs Ramkumar Ramachandra
2011-01-19 19:50 ` Junio C Hamano
2011-01-19 21:48 ` Jonathan Nieder
2011-01-20 4:50 ` Ramkumar Ramachandra
2011-01-20 5:48 ` Jonathan Nieder
2011-01-20 6:28 ` Ramkumar Ramachandra
2011-01-20 13:53 ` Drew Northup
2011-01-22 9:24 ` Ramkumar Ramachandra
2011-01-22 19:18 ` Jonathan Nieder
2011-01-20 5:41 ` Jonathan Nieder
2011-01-22 0:30 ` Junio C Hamano
2011-01-19 5:44 ` [PATCH 5/5] vcs-svn: Add dir_cache for svnload Ramkumar Ramachandra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7v39olok4l.fsf@alter.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=artagnon@gmail.com \
--cc=david.barr@cordelta.com \
--cc=git@vger.kernel.org \
--cc=jrnieder@gmail.com \
--cc=srabbelier@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).