git@vger.kernel.org mailing list mirror (one of many)
* Migration of git-scm.com to a static web site: ready for review/testing
@ 2023-11-17 13:25 Johannes Schindelin
  2023-11-17 16:26 ` Todd Zullinger
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Johannes Schindelin @ 2023-11-17 13:25 UTC (permalink / raw
  To: git; +Cc: Matt Burke, Victoria Dye, Matthias Aßhauer

Hi,

the idea of migrating https://git-scm.com/ from a Rails app to a static
site has been discussed several times on this list in the past.

Thanks to the heroic, multi-year efforts of Matt Burke, Victoria Dye and
Matthias Aßhauer, there is now a Pull Request, ready for review:
https://github.com/git/git-scm.com/pull/1804

This Pull Request is not for the faint of heart, mainly because of the
sheer number of generated pages that are committed to the repository (such
as the book, the manual pages, etc.; committing them is a design decision
necessary to run this as a static website).

These pages are generated by GitHub workflows that are intended to run on
a schedule, and the scripts that generate them are part of the Pull
Request. For that reason, I do not consider it necessary to review those
generated pages; they have already been reviewed in the upstream sources
from which they were generated.

At this point, the patches are fairly robust and I am mainly hoping for
help with verifying that the static site works as intended and that
existing links will continue to work with the new site (essentially: find
obscure references to the existing website, insert `git.github.io/` in the
URL, and verify that it works as intended).

To that end, I deployed this branch to GitHub Pages so that anyone
interested (hopefully many!) can have a look at
https://git.github.io/git-scm.com/ and compare to the existing
https://git-scm.com/.
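
For anyone who wants to script that check, here is a minimal Python sketch
of the URL rewrite described above. It assumes that only the host and the
path prefix change and that the rest of the path, the query and the
fragment carry over unchanged; the name `to_test_url` is made up for
illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Host and path prefix of the GitHub Pages test deployment named above.
TEST_HOST = "git.github.io"
TEST_PREFIX = "/git-scm.com"


def to_test_url(url: str) -> str:
    """Map a https://git-scm.com/... URL to its counterpart on the test site."""
    parts = urlsplit(url)
    if parts.netloc != "git-scm.com":
        raise ValueError(f"not a git-scm.com URL: {url}")
    # Keep scheme, query and fragment; swap the host and prepend the prefix.
    return urlunsplit(
        (parts.scheme, TEST_HOST, TEST_PREFIX + parts.path, parts.query, parts.fragment)
    )


print(to_test_url("https://git-scm.com/docs/git-rebase"))
```

The rewritten URL still has to be opened (or fetched) to confirm that the
page and its anchor exist; the sketch only produces the address to test.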

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-17 13:25 Migration of git-scm.com to a static web site: ready for review/testing Johannes Schindelin
@ 2023-11-17 16:26 ` Todd Zullinger
  2023-11-18  1:14   ` Johannes Schindelin
  2023-11-18  9:41 ` Johannes Sixt
  2023-11-23 18:53 ` Kaartic Sivaraam
  2 siblings, 1 reply; 9+ messages in thread
From: Todd Zullinger @ 2023-11-17 16:26 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer

Hello,

Johannes Schindelin wrote:
> At this point, the patches are fairly robust and I am mainly hoping for
> help with verifying that the static site works as intended, that existing
> links will continue to work with the new site (essentially, find obscure
> references to the existing website, then insert `git.github.io/` in the
> URL and verify that it works as intended).
> 
> To that end, I deployed this branch to GitHub Pages so that anyone
> interested (hopefully many!) can have a look at
> https://git.github.io/git-scm.com/ and compare to the existing
> https://git-scm.com/.

This is nice.  Thanks to all for working on it!

For checking links, a tool like linkchecker[1] is very handy.
This is run against the local docs in the Fedora package
builds to catch broken links.

I ran it against the test site and it turned up _a lot_ of
broken links.  It's enough that saving and sharing the
output is probably more work than having someone familiar
with the migration give it a run directly.

I ran `linkchecker https://git.github.io/git-scm.com/` and
the eventual result was:

  That's it. 13459 links in 14126 URLs checked. 0 warnings found. 6763 errors found.
  Stopped checking at 2023-11-17 11:11:17-004 (1 hour, 19 minutes)

The default output reports failures in a format like this:

  URL        `ch00/ch10-git-internals'
  Name       `Git Internals'
  Parent URL https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/, line 106, col 1318
  Real URL   https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/ch00/ch10-git-internals
  Check time 3.303 seconds
  Size       1KB
  Result     Error: 404 Not Found
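
A report in this format is easy to post-process. The following Python
sketch splits it into one dict per reported URL; it assumes the labels
shown in the sample above (plus `Info`, which appears when a redirect is
followed) and that records are separated by blank lines. The names
`LABELS` and `parse_report` are hypothetical, not part of LinkChecker.

```python
# Labels as they appear at the start of each line of a text report.
LABELS = ("Parent URL", "Real URL", "Check time", "URL", "Name", "Size", "Info", "Result")

SAMPLE = """\
  URL        `ch00/ch10-git-internals'
  Name       `Git Internals'
  Parent URL https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/, line 106, col 1318
  Real URL   https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/ch00/ch10-git-internals
  Check time 3.303 seconds
  Size       1KB
  Result     Error: 404 Not Found
"""


def parse_report(text):
    """Return a list of dicts, one per blank-line-separated record."""
    records, current, last_key = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current record, if there is one.
            if current:
                records.append(current)
            current, last_key = {}, None
            continue
        for label in LABELS:
            if line.startswith(label + " "):
                current[label] = line[len(label):].strip()
                last_key = label
                break
        else:
            # No label: treat as a continuation of the previous field.
            if last_key is not None:
                current[last_key] += " " + line
    if current:
        records.append(current)
    return records


records = parse_report(SAMPLE)
broken = [r for r in records if r.get("Result", "").startswith("Error")]
print(len(broken), "broken link(s)")
```

Grouping `broken` by `Parent URL` would then show which pages carry the
most failures.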

LinkChecker can be run in a mode which directs the failures
to a file.  That would be more like:

  linkchecker -F text/utf_8//tmp/git-scm-check.txt https://git.github.io/git-scm.com/

The format of the -F option is TYPE[/ENCODING][/FILENAME]
where TYPE can be text, html, sql, csv, gml, dot, xml,
sitemap, none or failures.  The failures type is much more
terse:

  1 "('https://git.github.io/git-scm.com/book/en/v2/Appendix-C:-Git-Commands-Plumbing-Commands/', 'https://git.github.io/git-scm.com/book/en/v2/Appendix-C:-Git-Commands-Plumbing-Commands/ch00/ch10-git-internals')"

I found the text type much more helpful in quickly spot
checking some of the failures since it includes the text
string used for the link.
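
The double slash in the -F argument above is not a typo: the first two
slashes separate TYPE, ENCODING and FILENAME, and the third one begins the
absolute path. A small Python sketch of that reading follows. It is only my
interpretation of the documented format, not LinkChecker's own parser, and
it assumes that an encoding is always given when a filename is;
`parse_file_output` and `FileOutput` are made-up names.

```python
from typing import NamedTuple, Optional


class FileOutput(NamedTuple):
    type: str
    encoding: Optional[str]
    filename: Optional[str]


def parse_file_output(spec: str) -> FileOutput:
    """Split a TYPE[/ENCODING][/FILENAME] spec into its three parts."""
    # Split on the first two slashes only, so the filename keeps its own.
    parts = spec.split("/", 2)
    out_type = parts[0]
    encoding = parts[1] if len(parts) > 1 and parts[1] else None
    filename = parts[2] if len(parts) > 2 and parts[2] else None
    return FileOutput(out_type, encoding, filename)


print(parse_file_output("text/utf_8//tmp/git-scm-check.txt"))
print(parse_file_output("failures"))
```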

Running it against a local directory of the content would be
a lot faster, if that's an option.  It's also worth bumping
the default number of threads from 10 to increase the speed
a bit.

[1] https://linkchecker.github.io/linkchecker/

-- 
Todd


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-17 16:26 ` Todd Zullinger
@ 2023-11-18  1:14   ` Johannes Schindelin
  2023-11-18  2:57     ` Todd Zullinger
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin @ 2023-11-18  1:14 UTC (permalink / raw
  To: Todd Zullinger; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer


Hi Todd,

On Fri, 17 Nov 2023, Todd Zullinger wrote:

> Johannes Schindelin wrote:
> > At this point, the patches are fairly robust and I am mainly hoping for
> > help with verifying that the static site works as intended, that existing
> > links will continue to work with the new site (essentially, find obscure
> > references to the existing website, then insert `git.github.io/` in the
> > URL and verify that it works as intended).
> >
> > To that end, I deployed this branch to GitHub Pages so that anyone
> > interested (hopefully many!) can have a look at
> > https://git.github.io/git-scm.com/ and compare to the existing
> > https://git-scm.com/.
>
> This is nice.  Thanks to all for working on it!

😊

> For checking links, a tool like linkchecker[1] is very handy.
> This is run against the local docs in the Fedora package
> builds to catch broken links.

Hmm, `linkchecker` is really slow for me, even locally.

> I ran it against the test site and it turned up _a lot_ of
> broken links.  [...]
>
>   URL        `ch00/ch10-git-internals'
>   Name       `Git Internals'
>   Parent URL https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/, line 106, col 1318
>   Real URL   https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/ch00/ch10-git-internals
>   Check time 3.303 seconds
>   Size       1KB
>   Result     Error: 404 Not Found

Good catch. I totally forgot to take care of the cross-references!

This is now fixed, as of
https://github.com/dscho/git-scm.com/commit/e599a57b2fadf8cb01e57af23fcb929b32e94bcb

I kicked off the GitHub workflow to re-generate the books, and the updated
GitHub Pages look fine (see e.g. the parent URL mentioned above and follow
the "Pull Request Refs" link).

> Running it against a local directory of the content would be
> a lot faster, if that's an option.  It's also worth bumping
> the default number of threads from 10 to increase the speed
> a bit.
>
> [1] https://linkchecker.github.io/linkchecker/

Unfortunately it is actually quite slow.

Granted, the added cross-references now increase the number of hyperlinks
to check, but after I let the program run for a bit over an hour to look
at https://git-scm.com/ (for comparison), it is now running on the local
build (i.e. the `public/` folder generated by Hugo, not even an HTTP
server) for over 45 minutes and still not done:

-- snip --
[...]
10 threads active, 112977 links queued, 206443 links in 100001 URLs checked, runtime 48 minutes, 46 seconds
10 threads active, 113455 links queued, 206689 links in 100001 URLs checked, runtime 48 minutes, 52 seconds
10 threads active, 113829 links queued, 206874 links in 100001 URLs checked, runtime 48 minutes, 57 seconds
10 threads active, 114230 links queued, 207136 links in 100001 URLs checked, runtime 49 minutes, 3 seconds
10 threads active, 114731 links queued, 207498 links in 100001 URLs checked, runtime 49 minutes, 9 seconds
-- snap --

Maybe something is going utterly wrong, because the number of links seems
to be dramatically larger than what the https://git-scm.com/ run reported;
maybe linkchecker broke out of the `public/` directory and now indexes my
entire hard drive ;-)

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-18  1:14   ` Johannes Schindelin
@ 2023-11-18  2:57     ` Todd Zullinger
  2023-11-21 14:25       ` Johannes Schindelin
  0 siblings, 1 reply; 9+ messages in thread
From: Todd Zullinger @ 2023-11-18  2:57 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer

Hi Johannes,

Johannes Schindelin wrote:
>> For checking links, a tool like linkchecker[1] is very handy.
>> This is run against the local docs in the Fedora package
>> builds to catch broken links.
> 
> Hmm, `linkchecker` is really slow for me, even locally.

Yeah, it took an hour and a half to run for me, both on an
old laptop and a fast server with plenty of threads,
bandwidth, and memory.

Checking the git HTML documentation takes under 30 seconds,
which is largely the only place I've used it.  It has been
very helpful in catching broken links in the docs during the
build and the time is short enough that I never minded.

> Granted, the added cross-references now increase the number of hyperlinks
> to check, but after I let the program run for a bit over an hour to look
> at https://git-scm.com/ (for comparison), it is now running on the local
> build (i.e. the `public/` folder generated by Hugo, not even an HTTP
> server) for over 45 minutes and still not done:
> 
> -- snip --
> [...]
> 10 threads active, 112977 links queued, 206443 links in 100001 URLs checked, runtime 48 minutes, 46 seconds
> 10 threads active, 113455 links queued, 206689 links in 100001 URLs checked, runtime 48 minutes, 52 seconds
> 10 threads active, 113829 links queued, 206874 links in 100001 URLs checked, runtime 48 minutes, 57 seconds
> 10 threads active, 114230 links queued, 207136 links in 100001 URLs checked, runtime 49 minutes, 3 seconds
> 10 threads active, 114731 links queued, 207498 links in 100001 URLs checked, runtime 49 minutes, 9 seconds
> -- snap --

I would have thought that bumping the number of threads a
lot would really help, but I ran it on a dual Xeon system
with 40 threads and it took about the same time.  Perhaps I
should have increased it to double the system processor
count or more.

> Maybe something is going utterly wrong because the number
> of links seems to be dramatically larger than what the
> https://git-scm.com/ reported; Maybe linkchecker broke out
> of the `public/` directory and now indexes my entire
> harddrive ;-)

Heh, hopefully not. :)

I wondered if there were circular links that it was picking
up and not de-duplicating.  I may try to run it with the
--verbose option which logs all checked URLs.  Maybe that
will turn up something.  It sure seems like there's a _lot_
of links here.

There is a --recursion-level option which might be helpful.
The --ignore-url and/or --no-follow-url may also be useful.

Though even if it's (very) slow, it might be worth running
to flush out some initial issues before making the site
live.  Letting it run in the background for a few hours is
probably less effort than fielding a number of bug reports
about broken URLs here and there. :)

Of course, it would be even better if it were fast enough to
run as part of the site build process to catch broken links
before each deployment, but that would need to be measured
in some relatively small number of seconds instead of the
hours it seems to take now. :/

-- 
Todd


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-17 13:25 Migration of git-scm.com to a static web site: ready for review/testing Johannes Schindelin
  2023-11-17 16:26 ` Todd Zullinger
@ 2023-11-18  9:41 ` Johannes Sixt
  2023-11-18  9:46   ` Johannes Schindelin
  2023-11-23 18:53 ` Kaartic Sivaraam
  2 siblings, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2023-11-18  9:41 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer

Am 17.11.23 um 14:25 schrieb Johannes Schindelin:
> Hi,
> 
> the idea of migrating https://git-scm.com/ from a Rails app to a static
> site has been discussed several times on this list in the past.
> 
> Thanks to the heroic, multi-year efforts of Matt Burke, Victoria Dye and
> Matthias Aßhauer, there is now a Pull Request, ready for review:
> https://github.com/git/git-scm.com/pull/1804
> 
> This Pull Request is not for the faint of heart, mainly because of the
> sheer amount of generated pages that are committed to the repository (such
> as the book, the manual pages, etc, a design decision necessary to run
> this as a static website).
> 
> These pages are generated by GitHub workflows that are intended to run on
> a schedule, and the scripts that generate them are part of the Pull
> Request. For that reason, I do not consider it necessary to review those
> generated pages, those reviews have been done in the upstream sources from
> which the pages were generated.
> 
> At this point, the patches are fairly robust and I am mainly hoping for
> help with verifying that the static site works as intended, that existing
> links will continue to work with the new site (essentially, find obscure
> references to the existing website, then insert `git.github.io/` in the
> URL and verify that it works as intended).
> 
> To that end, I deployed this branch to GitHub Pages so that anyone
> interested (hopefully many!) can have a look at
> https://git.github.io/git-scm.com/ and compare to the existing
> https://git-scm.com/.

When a transition to static pages happens, an important aspect is that
external links that point into git-scm.com must not be invalidated.
There are many such links in Stackoverflow answers, for example.

I checked one link:

https://git-scm.com/docs/git-rebase#Documentation/git-rebase.txt--r
https://git.github.io/git-scm.com/docs/git-rebase#Documentation/git-rebase.txt--r

and it is looking very good. Thank you! I assume that keeping links
working is not just a happy accident, but part of the plan.

-- Hannes



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-18  9:41 ` Johannes Sixt
@ 2023-11-18  9:46   ` Johannes Schindelin
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Schindelin @ 2023-11-18  9:46 UTC (permalink / raw
  To: Johannes Sixt; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer

Hi Hannes,

Yes, keeping existing deep links working is very much a goal of this work.

Thank you,
Johannes





^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-18  2:57     ` Todd Zullinger
@ 2023-11-21 14:25       ` Johannes Schindelin
  2023-11-28  1:54         ` Todd Zullinger
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin @ 2023-11-21 14:25 UTC (permalink / raw
  To: Todd Zullinger; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer


Hi Todd,

On Fri, 17 Nov 2023, Todd Zullinger wrote:

> Johannes Schindelin wrote:
> >> For checking links, a tool like linkchecker[1] is very handy.
> >> This is run against the local docs in the Fedora package
> >> builds to catch broken links.
> >
> > Hmm, `linkchecker` is really slow for me, even locally.
>
> Yeah, it took an hour and a half to run for me, both on an
> old laptop and a fast server with plenty of threads,
> bandwidth, and memory.
>
> Checking the git HTML documentation takes under 30 seconds,
> which is largely the only place I've used it.  It has been
> very helpful in catching broken links in the docs during the
> build and the time is short enough that I never minded.

I found https://lychee.cli.rs/#/ in the meantime and figured out how to
use it in a local setup:

First, I run:

	HUGO_TIMEOUT=777 HUGO_BASEURL= HUGO_UGLYURLS=false time hugo

The first `HUGO_*` setting makes sure that the build does not fail even
when all of the cores of my laptop's CPU are busy. The other two override
settings from `hugo.yml` so that `lychee` can handle the output (unlike
GitHub Pages, `lychee` will not auto-append `.html` and would therefore
mis-detect tons of broken links without `HUGO_UGLYURLS=false`).

In my setup, this command typically runs for something like half a minute,
but sometimes takes as long as 1 minute. (I noticed that it is much
slower when I open the directory in VS Code because I'm running this in
WSL and the filesystem watcher kind of eats all resources.)

After that, I run:

	time lychee --offline --exclude-mail \
		--exclude file:///path/to/repo.git/ \
		--exclude file:///caminho/para/o/reposit%C3%B3rio.git/ \
		--exclude file:///ruta/a/repositorio.git/ \
		--exclude file:///sl%C3%B3%C3%B0/til/hirsla.git/ \
		--exclude file:///Pfad/zum/Repo.git/ \
		--exclude file:///chemin/du/d%C3%A9p%C3%B4t.git/ \
		--exclude file:///srv/git/project.git \
		--exclude "file://$PWD/public/pagefind/pagefind-ui.css" \
		--format markdown -o lychee-local.md public/

Without `--offline`, there would be a couple of broken links (the
http://git.or.cz/gitwiki/InterfacesFrontendsAndTools link leads to
"Forbidden"; it needs to be changed to https://).

The `file:///` URLs are all examples that are not expected to be valid.
And we do not want to check the emails (tons of `xyz@example.com` would be
"broken").

This command typically takes another half minute, sometimes a bit longer.
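
Since the exclude list is likely to grow, it may be easier to assemble the
command from a list than to maintain the long shell line. Here is a Python
sketch that builds the same kind of invocation; it uses only the lychee
options shown above, and the function name `lychee_command` and the
shortened exclude list are illustrative assumptions.

```python
import shlex


def lychee_command(root, report, excludes):
    """Build the argument list for an offline lychee run over `root`."""
    cmd = ["lychee", "--offline", "--exclude-mail"]
    for pattern in excludes:
        # One --exclude option per example URL that is not meant to resolve.
        cmd += ["--exclude", pattern]
    cmd += ["--format", "markdown", "-o", report, root]
    return cmd


# A shortened, illustrative subset of the excludes listed above.
EXAMPLE_EXCLUDES = [
    "file:///path/to/repo.git/",
    "file:///srv/git/project.git",
]

cmd = lychee_command("public/", "lychee-local.md", EXAMPLE_EXCLUDES)
# Print a shell-quoted version for copy and paste.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would execute it, provided `lychee` is
installed and the site has been built into `public/` first.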

Given those times and the configurability (and the lure of a GitHub
Action that could be easily integrated into a GitHub workflow:
https://github.com/marketplace/actions/lychee-broken-link-checker), I gave
up on linkchecker and focused exclusively on lychee.

Now, when I started working on this on Friday, lychee reported about
12,000 broken links.

There were a couple of legitimate mistakes I made (when feeding paths to
Hugo's `relURL` function, the path must not have a leading slash or it
will remain unchanged, for example). These are fixed.
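
To make the `relURL` pitfall concrete, here is a toy Python model of the
behaviour described above. It is not Hugo's implementation, only the rule
as stated in this message; `BASE_PATH` and `rel_url` are made-up names, and
the base path assumes the site is served from `/git-scm.com/` as on the
test deployment.

```python
# Assumed path under which the site is served (as on the test deployment).
BASE_PATH = "/git-scm.com/"


def rel_url(path: str) -> str:
    """Toy model: prefix relative paths, leave leading-slash paths alone."""
    if path.startswith("/"):
        # The mistake described above: this path skips the base path.
        return path
    return BASE_PATH + path


print(rel_url("docs/git-clone"))
print(rel_url("/docs/git-clone"))
```

In this model the second call yields a link that points outside the
deployment's sub-directory, which is how such links ended up broken.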

But there were also many other issues such as some manual page translation
being incomplete yet linking to not-yet-existing pages. In those cases, I
changed the code to generate redirects to the English version. For example,
https://git.github.io/git-scm.com/docs/git-clone/fr#_git has a link to
`git[1]` that _should_ lead to the French version of the `git` manual
page. However, that does not exist. So both the Rails App as well as the
static website redirect to the English variant of that page.
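
The fallback rule can be sketched in Python as follows. This is only an
illustration of the idea: the static site generates redirect pages rather
than resolving links at request time, and `resolve_manpage` and the
`available` set are hypothetical. The `docs/<name>/<lang>` layout is taken
from the example URL above.

```python
def resolve_manpage(name, lang, available):
    """Return the page to serve: the translation if it exists, else English."""
    if lang != "en" and (name, lang) in available:
        return f"docs/{name}/{lang}"
    # Missing translation (or English requested): use the English page.
    return f"docs/{name}"


# Hypothetical inventory: git-clone has a French translation, git does not.
available = {("git-clone", "fr")}

print(resolve_manpage("git-clone", "fr", available))
print(resolve_manpage("git", "fr", available))
```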

My most recent lychee run results in 0 broken links.

As a bonus, some of the links that are currently broken on
https://git-scm.com/ are fixed in https://git.github.io/git-scm.com/.
For example, following the `Pull Request Referləri` link at the top of
https://git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
leads to a 404. But following it in
https://git.github.io/git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
directs the browser to the correct URL:
https://git.github.io/git-scm.com/book/az/v2/GitHub-Bir-Layih%C9%99nin-Saxlan%C4%B1lmas%C4%B1/#_pr_refs

Also broken on https://git-scm.com/ are the footnotes in
the Czech translation of the ProGit book. These were broken in the Hugo
version, too, but now they are fixed. See e.g.
https://dscho.github.io/git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/#_footnotedef_7
and note that the Rails App redirects to
https://git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/ch00/_footnotedef_7
when clicking on the `[7]`, which 404s.

Could you double-check the links in the current version?

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-17 13:25 Migration of git-scm.com to a static web site: ready for review/testing Johannes Schindelin
  2023-11-17 16:26 ` Todd Zullinger
  2023-11-18  9:41 ` Johannes Sixt
@ 2023-11-23 18:53 ` Kaartic Sivaraam
  2 siblings, 0 replies; 9+ messages in thread
From: Kaartic Sivaraam @ 2023-11-23 18:53 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Matt Burke, Victoria Dye, Matthias Aßhauer, git

Hi Johannes,

On 17/11/23 18:55, Johannes Schindelin wrote:
> 
> To that end, I deployed this branch to GitHub Pages so that anyone
> interested (hopefully many!) can have a look at
> https://git.github.io/git-scm.com/ and compare to the existing
> https://git-scm.com/.
> 

Thanks for hosting it to easily check things!

I gave the search a quick try and it seems to be behaving a bit
strangely.

For instance, I searched for 'commit' and 'log'. I was hoping to see the
corresponding reference pages in the results, but they don't seem to
show up, at least not in the first few results. They do show up in the
first few results on the existing site.

Here are some screenshots:

Existing site: https://ibb.co/pZHx9TM

New site:
https://ibb.co/dMpNth3
https://ibb.co/h26J5Rx

This is not always the case, though. Some terms like 'checkout' seem to
bring the relevant results properly in the top results.

-- 
Sivaraam


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Migration of git-scm.com to a static web site: ready for review/testing
  2023-11-21 14:25       ` Johannes Schindelin
@ 2023-11-28  1:54         ` Todd Zullinger
  0 siblings, 0 replies; 9+ messages in thread
From: Todd Zullinger @ 2023-11-28  1:54 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, Matt Burke, Victoria Dye, Matthias Aßhauer

[-- Attachment #1: Type: text/plain, Size: 1842 bytes --]

Hi Johannes,

Johannes Schindelin wrote:
> I found https://lychee.cli.rs/#/ in the meantime and figured out how to
> use it in a local setup:

Nice.  That's much faster.

> My most recent lychee run results in 0 broken links.
> 
> As a bonus, some of the links that are currently broken on
> https://git-scm.com/ are fixed in https://git.github.io/git-scm.com/.
> For example, following the `Pull Request Referləri` link at the top of
> https://git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
> leads to a 404. But following it in
> https://git.github.io/git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
> directs the browser to the correct URL:
> https://git.github.io/git-scm.com/book/az/v2/GitHub-Bir-Layih%C9%99nin-Saxlan%C4%B1lmas%C4%B1/#_pr_refs
> 
> Another thing that is broken on https://git-scm.com/ are the footnotes in
> the Czech translation of the ProGit book. These were broken in the Hugo
> version, too, but now they are fixed. See e.g.
> https://dscho.github.io/git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/#_footnotedef_7
> and note that the Rails App redirects to
> https://git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/ch00/_footnotedef_7
> when clicking on the `[7]`, which 404s.
> 
> Could you double-check the links in the current version?

Since I had it already, I ran linkchecker again.  It found
25 errors.  I'll attach the output, though I'm not sure if
the list will pass it along or not.

It looks like a number of errors are due to '?' characters
in the generated links, e.g.:

https://git.github.io/git-scm.com/book/en/v2/Getting-Started-What-is-Git?/
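
My guess at the mechanism, as a short Python sketch: a literal '?' starts
the query string, so everything after it is no longer part of the path,
whereas the percent-encoded form %3F stays inside the path segment. The
helper name `encode_segment` is made up for illustration.

```python
from urllib.parse import quote, urlsplit

url = "https://git.github.io/git-scm.com/book/en/v2/Getting-Started-What-is-Git?/"
parts = urlsplit(url)

# The path stops at the '?', and the trailing '/' ends up in the query.
print("path: ", parts.path)
print("query:", parts.query)


def encode_segment(segment: str) -> str:
    """Percent-encode one path segment, including '?' and '/'."""
    return quote(segment, safe="")


print(encode_segment("Getting-Started-What-is-Git?"))
```

That would match the attached report, where the link itself uses %3F but
the redirect target contains a bare '?'.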

Cheers,

-- 
Todd

[-- Attachment #2: git.github.io_git-scm.com.txt --]
[-- Type: text/plain, Size: 13508 bytes --]

LinkChecker 10.2.1
Copyright (C) 2000-2016 Bastian Kleineidam, 2010-2022 LinkChecker Authors
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it under
certain conditions. Look at the file `LICENSE' within this distribution.
Read the documentation at https://linkchecker.github.io/linkchecker/
Write comments and bugs to https://github.com/linkchecker/linkchecker/issues

Start checking at 2023-11-25 18:49:01+000

URL        `/git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F'
Name       `What is Git?'
Parent URL https://git.github.io/git-scm.com/book/en/v2, line 9, col 1
Real URL   https://git.github.io/git-scm.com/book/en/v2/Getting-Started-What-is-Git?/
Check time 2.388 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/en/v2/Getting-Started-What-is-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/az/v2/Ba%c5%9flan%c4%9f%c4%b1c-Git-N%c9%99dir%3F'
Name       `Git Nədir?'
Parent URL https://git.github.io/git-scm.com/book/az/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/az/v2/Ba%C5%9Flan%C4%9F%C4%B1c-Git-N%C9%99dir?/
Check time 0.074 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/az/v2/Ba%C5%9Flan%C4%9F%C4%B1c-Git-N%C9%99dir?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Versionsverwaltung%3F'
Name       `Erste Schritte'
Parent URL https://git.github.io/git-scm.com/book/de/v2, line 5, col 3159
Real URL   https://git.github.io/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Versionsverwaltung?/
Check time 0.121 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Versionsverwaltung?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Git%3F'
Name       `Was ist Git?'
Parent URL https://git.github.io/git-scm.com/book/de/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Git?/
Check time 0.068 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/de/v2/Erste-Schritte-Was-ist-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/es/v2/Inicio---Sobre-el-Control-de-Versiones-%c2%bfC%c3%b3mo-obtener-ayuda%3F'
Name       `¿Cómo obtener ayuda?'
Parent URL https://git.github.io/git-scm.com/book/es/v2, line 12, col 1
Real URL   https://git.github.io/git-scm.com/book/es/v2/Inicio---Sobre-el-Control-de-Versiones-%C2%BFC%C3%B3mo-obtener-ayuda?/
Check time 0.093 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/es/v2/Inicio---Sobre-el-Control-de-Versiones-%C2%BFC%C3%B3mo-obtener-ayuda?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/es/v2/Ramificaciones-en-Git-%c2%bfQu%c3%a9-es-una-rama%3F'
Name       `Ramificaciones en Git'
Parent URL https://git.github.io/git-scm.com/book/es/v2, line 21, col 111
Real URL   https://git.github.io/git-scm.com/book/es/v2/Ramificaciones-en-Git-%C2%BFQu%C3%A9-es-una-rama?/
Check time 0.094 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/es/v2/Ramificaciones-en-Git-%C2%BFQu%C3%A9-es-una-rama?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/ko/v2/%ec%8b%9c%ec%9e%91%ed%95%98%ea%b8%b0-%eb%b2%84%ec%a0%84-%ea%b4%80%eb%a6%ac%eb%9e%80%3F'
Name       `시작하기'
Parent URL https://git.github.io/git-scm.com/book/ko/v2, line 5, col 3159
Real URL   https://git.github.io/git-scm.com/book/ko/v2/%EC%8B%9C%EC%9E%91%ED%95%98%EA%B8%B0-%EB%B2%84%EC%A0%84-%EA%B4%80%EB%A6%AC%EB%9E%80?/
Check time 0.094 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/ko/v2/%EC%8B%9C%EC%9E%91%ED%95%98%EA%B8%B0-%EB%B2%84%EC%A0%84-%EA%B4%80%EB%A6%AC%EB%9E%80?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git%3F'
Name       `Wat is Git?'
Parent URL https://git.github.io/git-scm.com/book/nl/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git?/
Check time 0.070 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/ru/v2/%d0%92%d0%b2%d0%b5%d0%b4%d0%b5%d0%bd%d0%b8%d0%b5-%d0%a7%d1%82%d0%be-%d1%82%d0%b0%d0%ba%d0%be%d0%b5-Git%3F'
Name       `Что такое Git?'
Parent URL https://git.github.io/git-scm.com/book/ru/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/ru/v2/%D0%92%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5-%D0%A7%D1%82%D0%BE-%D1%82%D0%B0%D0%BA%D0%BE%D0%B5-Git?/
Check time 0.075 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/ru/v2/%D0%92%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5-%D0%A7%D1%82%D0%BE-%D1%82%D0%B0%D0%BA%D0%BE%D0%B5-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/sl/v2/Za%c4%8detek-Kaj-je-Git%3F'
Name       `Kaj je Git?'
Parent URL https://git.github.io/git-scm.com/book/sl/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/sl/v2/Za%C4%8Detek-Kaj-je-Git?/
Check time 0.071 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/sl/v2/Za%C4%8Detek-Kaj-je-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/ru/v2/%d0%92%d0%b2%d0%b5%d0%b4%d0%b5%d0%bd%d0%b8%d0%b5-%d0%9a%d0%b0%d0%ba-%d0%bf%d0%be%d0%bb%d1%83%d1%87%d0%b8%d1%82%d1%8c-%d0%bf%d0%be%d0%bc%d0%be%d1%89%d1%8c%3F'
Name       `Как получить помощь?'
Parent URL https://git.github.io/git-scm.com/book/ru/v2, line 12, col 1
Real URL   https://git.github.io/git-scm.com/book/ru/v2/%D0%92%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5-%D0%9A%D0%B0%D0%BA-%D0%BF%D0%BE%D0%BB%D1%83%D1%87%D0%B8%D1%82%D1%8C-%D0%BF%D0%BE%D0%BC%D0%BE%D1%89%D1%8C?/
Check time 0.077 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/ru/v2/%D0%92%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5-%D0%9A%D0%B0%D0%BA-%D0%BF%D0%BE%D0%BB%D1%83%D1%87%D0%B8%D1%82%D1%8C-%D0%BF%D0%BE%D0%BC%D0%BE%D1%89%D1%8C?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/sr/v2/%d0%9f%d0%be%d1%87%d0%b5%d1%82%d0%b0%d0%ba-%d0%a8%d1%82%d0%b0-%d1%98%d0%b5-%d0%93%d0%b8%d1%82%3F'
Name       `Шта је Гит?'
Parent URL https://git.github.io/git-scm.com/book/sr/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/sr/v2/%D0%9F%D0%BE%D1%87%D0%B5%D1%82%D0%B0%D0%BA-%D0%A8%D1%82%D0%B0-%D1%98%D0%B5-%D0%93%D0%B8%D1%82?/
Check time 0.084 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/sr/v2/%D0%9F%D0%BE%D1%87%D0%B5%D1%82%D0%B0%D0%BA-%D0%A8%D1%82%D0%B0-%D1%98%D0%B5-%D0%93%D0%B8%D1%82?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/uz/v2/%d0%98%d1%88-%d0%b1%d0%be%d1%88%d0%bb%d0%b0%d0%bd%d0%b8%d1%88%d0%b8-%d2%9a%d0%b0%d0%bd%d0%b4%d0%b0%d0%b9-%d1%91%d1%80%d0%b4%d0%b0%d0%bc-%d0%be%d0%bb%d0%b8%d1%88-%d0%bc%d1%83%d0%bc%d0%ba%d0%b8%d0%bd%3F'
Name       `Қандай ёрдам олиш мумкин?'
Parent URL https://git.github.io/git-scm.com/book/uz/v2, line 12, col 1
Real URL   https://git.github.io/git-scm.com/book/uz/v2/%D0%98%D1%88-%D0%B1%D0%BE%D1%88%D0%BB%D0%B0%D0%BD%D0%B8%D1%88%D0%B8-%D2%9A%D0%B0%D0%BD%D0%B4%D0%B0%D0%B9-%D1%91%D1%80%D0%B4%D0%B0%D0%BC-%D0%BE%D0%BB%D0%B8%D1%88-%D0%BC%D1%83%D0%BC%D0%BA%D0%B8%D0%BD?/
Check time 0.074 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/uz/v2/%D0%98%D1%88-%D0%B1%D0%BE%D1%88%D0%BB%D0%B0%D0%BD%D0%B8%D1%88%D0%B8-%D2%9A%D0%B0%D0%BD%D0%B4%D0%B0%D0%B9-%D1%91%D1%80%D0%B4%D0%B0%D0%BC-%D0%BE%D0%BB%D0%B8%D1%88-%D0%BC%D1%83%D0%BC%D0%BA%D0%B8%D0%BD?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/be/v2/%d0%9f%d0%b5%d1%80%d1%88%d1%8b%d1%8f-%d0%ba%d1%80%d0%be%d0%ba%d1%96-What-is-Git%3F'
Name       `What is Git?'
Parent URL https://git.github.io/git-scm.com/book/be/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/be/v2/%D0%9F%D0%B5%D1%80%D1%88%D1%8B%D1%8F-%D0%BA%D1%80%D0%BE%D0%BA%D1%96-What-is-Git?/
Check time 0.105 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/be/v2/%D0%9F%D0%B5%D1%80%D1%88%D1%8B%D1%8F-%D0%BA%D1%80%D0%BE%D0%BA%D1%96-What-is-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/it/v2/Per-Iniziare-Cos%e2%80%99%c3%a9-Git%3F'
Name       `Cos’é Git?'
Parent URL https://git.github.io/git-scm.com/book/it/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/it/v2/Per-Iniziare-Cos%E2%80%99%C3%A9-Git?/
Check time 0.068 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/it/v2/Per-Iniziare-Cos%E2%80%99%C3%A9-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/ms/v2/Getting-Started-What-is-Git%3F'
Name       `What is Git?'
Parent URL https://git.github.io/git-scm.com/book/ms/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/ms/v2/Getting-Started-What-is-Git?/
Check time 0.078 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/ms/v2/Getting-Started-What-is-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/sv/v2/Kom-ig%c3%a5ng-Vad-%c3%a4r-Git%3F'
Name       `Vad är Git?'
Parent URL https://git.github.io/git-scm.com/book/sv/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/sv/v2/Kom-ig%C3%A5ng-Vad-%C3%A4r-Git?/
Check time 0.086 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/sv/v2/Kom-ig%C3%A5ng-Vad-%C3%A4r-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/book/tr/v2/Ba%c5%9flang%c4%b1%c3%a7-Git-Nedir%3F'
Name       `Git Nedir?'
Parent URL https://git.github.io/git-scm.com/book/tr/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/tr/v2/Ba%C5%9Flang%C4%B1%C3%A7-Git-Nedir?/
Check time 0.076 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/tr/v2/Ba%C5%9Flang%C4%B1%C3%A7-Git-Nedir?/'.
Result     Error: 404 Not Found
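
The failures listed so far share one pattern: the link in the page carries a
percent-encoded question mark (`%3F`), while the redirect target reported
under "Real URL" carries a literal `?`. A URL parser treats a literal `?` as
the start of the query string, so the requested path loses its final
character. The sketch below illustrates this with Python's standard
`urllib.parse`, applied to the Dutch entry. Python is my choice for
illustration only; nothing here says how the site or the checker handles
URLs internally.

```python
from urllib.parse import urlsplit, unquote

BASE = "https://git.github.io/git-scm.com/book/nl/v2/"

# The link as it appears in the page: the question mark is percent-encoded.
encoded = urlsplit(BASE + "Aan-de-slag-Wat-is-Git%3F")
# The redirect target reported in the log: the question mark is literal.
redirected = urlsplit(BASE + "Aan-de-slag-Wat-is-Git?/")

print(unquote(encoded.path))   # decoded path still ends in "Wat-is-Git?"
print(redirected.path)         # path now ends in "Wat-is-Git"
print(repr(redirected.query))  # the trailing "/" has become the query string
```

If this reading is right, the page would only be found again if the redirect
kept the question mark encoded as `%3F`.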

URL        `/git-scm.com/book/fr/v2/Les-branches-avec-Git-Branches-et-fusions'
Name       `next'
Parent URL https://git.github.io/git-scm.com/book/fr/v2/Les-branches-avec-Git-Les-branches-en-bref, line 160, col 1436
Real URL   https://git.github.io/git-scm.com/book/fr/v2/Les-branches-avec-Git-Branches-et-fusions
Check time 1.887 seconds
Size       1KB
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-submodules'
Parent URL https://git.github.io/git-scm.com/docs/git-submodules/fr, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-submodules/
Check time 5.581 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-submodules/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-maintainance'
Parent URL https://git.github.io/git-scm.com/docs/git-maintainance/is, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-maintainance/
Check time 5.867 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-maintainance/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/gitignorar'
Parent URL https://git.github.io/git-scm.com/docs/gitignorar/pt_BR, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/gitignorar/
Check time 5.546 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/gitignorar/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-fsmonitor----daemon'
Parent URL https://git.github.io/git-scm.com/docs/git-fsmonitor----daemon/pt_BR, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-fsmonitor----daemon/
Check time 5.616 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-fsmonitor----daemon/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-pack'
Parent URL https://git.github.io/git-scm.com/docs/git-pack/pt_BR, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-pack/
Check time 4.311 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-pack/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-hash'
Parent URL https://git.github.io/git-scm.com/docs/git-hash/fr, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-hash/
Check time 1.156 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-hash/'.
Result     Error: 404 Not Found

Statistics:
Downloaded: 302.2MB.
Content types: 5656 image, 11838 text, 0 video, 0 audio, 15 application, 29 mail and 620 other.
URL lengths: min=15, max=841, avg=72.

That's it. 18158 links in 19951 URLs checked. 0 warnings found. 25 errors found.
Stopped checking at 2023-11-25 19:34:52+000 (45 minutes, 51 seconds)
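
To triage a log like the one above without reading every record, the entries
can be grouped mechanically. The following is a minimal sketch, assuming the
text layout shown here (records separated by blank lines, one labelled field
per line, indented continuation lines). The function names and the two
buckets are hypothetical and chosen only for illustration; the sample is two
records copied from the log.

```python
from collections import Counter

# Field labels as they appear in the log excerpt above.
FIELDS = ("URL", "Name", "Parent URL", "Real URL",
          "Check time", "Size", "Info", "Result")

def parse_records(log_text):
    """Split a text log into one dict per record (blank-line separated)."""
    records = []
    for block in log_text.split("\n\n"):
        record, last = {}, None
        for line in block.splitlines():
            if line[:1].isspace() and last:
                # Indented line: continuation of the previous field.
                record[last] += " " + line.strip()
                continue
            for label in FIELDS:
                if line.startswith(label + " "):
                    record[label] = line[len(label):].strip()
                    last = label
                    break
        if "URL" in record and "Result" in record:
            records.append(record)
    return records

def classify(record):
    """Rough, illustrative bucket for a failed link."""
    url = record["URL"].strip("`'")
    if "%3f" in url.lower():
        return "encoded question mark"
    if "/docs/" in url:
        return "manual page"
    return "other"

SAMPLE = """\
URL        `/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git%3F'
Name       `Wat is Git?'
Parent URL https://git.github.io/git-scm.com/book/nl/v2, line 8, col 1
Real URL   https://git.github.io/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git?/
Check time 0.070 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/book/nl/v2/Aan-de-slag-Wat-is-Git?/'.
Result     Error: 404 Not Found

URL        `/git-scm.com/docs/git-hash'
Parent URL https://git.github.io/git-scm.com/docs/git-hash/fr, line 2, col 1
Real URL   https://git.github.io/git-scm.com/docs/git-hash/
Check time 1.156 seconds
Size       1KB
Info       Redirected to
           `https://git.github.io/git-scm.com/docs/git-hash/'.
Result     Error: 404 Not Found
"""

records = parse_records(SAMPLE)
counts = Counter(classify(r) for r in records)
for bucket, n in counts.items():
    print(f"{n:3d}  {bucket}")
```

Run over the full log, a tally of this kind would show at a glance how many
of the 25 errors come from the `%3F` pattern and how many from the `/docs/`
redirects.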


Thread overview: 9+ messages
2023-11-17 13:25 Migration of git-scm.com to a static web site: ready for review/testing Johannes Schindelin
2023-11-17 16:26 ` Todd Zullinger
2023-11-18  1:14   ` Johannes Schindelin
2023-11-18  2:57     ` Todd Zullinger
2023-11-21 14:25       ` Johannes Schindelin
2023-11-28  1:54         ` Todd Zullinger
2023-11-18  9:41 ` Johannes Sixt
2023-11-18  9:46   ` Johannes Schindelin
2023-11-23 18:53 ` Kaartic Sivaraam
