* Is offloading to GPU a worthwhile feature?
@ 2018-02-27 20:52 Konstantin Ryabitsev
  2018-02-27 22:08 ` Stefan Beller
  2018-04-08 13:59 ` Jakub Narebski
  0 siblings, 2 replies; 5+ messages in thread
From: Konstantin Ryabitsev @ 2018-02-27 20:52 UTC (permalink / raw)
  To: git


Hi, all:

This is an entirely idle pondering kind of question, but I wanted to
ask. I recently discovered that some edge providers are starting to
offer systems with GPU cards in them -- primarily for clients that need
to provide streaming video content, I guess. As someone who needs to run
a distributed network of edge nodes for a fairly popular git server, I
wondered if git could at all benefit from utilizing a GPU card for
something like delta calculations or compression offload, or if benefits
would be negligible.

I realize this would be silly amounts of work. But, if it's worth it,
perhaps we can benefit from all the GPU computation libs written for
cryptocoin mining and use them for something good. :)

Best,
-- 
Konstantin Ryabitsev



* Re: Is offloading to GPU a worthwhile feature?
  2018-02-27 20:52 Is offloading to GPU a worthwhile feature? Konstantin Ryabitsev
@ 2018-02-27 22:08 ` Stefan Beller
  2018-04-08 13:59 ` Jakub Narebski
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2018-02-27 22:08 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: git

On Tue, Feb 27, 2018 at 12:52 PM, Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> compression offload

Currently there is a series under review that introduces a commit-graph
file[1], which would make it unnecessary to decompress objects for a
rev walk; the walking information would instead be available on disk
as needed.

Once walking (as part of negotiation) is done, we'd have to build a
pack file to return to the client, and that step could perhaps be
improved by GPU acceleration[2].
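
To give a flavor of the data-parallel step that GPU-side pack
compression builds on, here is a minimal sketch (CUDA C; the kernel
and all names are hypothetical illustrations, not code from git) that
computes one fingerprint per fixed-size window, one thread per window,
as one might when scanning for delta-base candidates:

    #include <stdint.h>

    /* One fingerprint per fixed-size window; windows are mutually
     * independent, so each thread can work in isolation. */
    __global__ void window_fingerprints(const unsigned char *buf,
                                        size_t len, size_t win,
                                        uint32_t *fp)
    {
        size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
        if ((i + 1) * win > len)
            return;
        uint32_t h = 0;
        for (size_t j = 0; j < win; j++)
            h = h * 31 + buf[i * win + j];
        /* equal fingerprints mark delta-base candidates */
        fp[i] = h;
    }

The per-window independence is exactly the fine-grained parallelism a
GPU wants; the branchy part, the actual delta encoding between chosen
pairs, is a much worse fit.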

Once upon a time, though, Junio proposed changing this part of the
protocol as well. Instead of having a packfile catered to a specific
user/request, the server would store multiple pack files, sorted
temporally: for example, one "old" packfile containing everything
roughly older than 4 weeks, a "medium" packfile covering up to last
weekend, a "new" pack holding just this week's work, and finally an
on-demand pack with the very latest, generated on the fly.

The server would just dump these different packfiles (concatenated?)
at the user, and would need to refresh them only occasionally, every
week or so.

[1] https://public-inbox.org/git/1519698787-190494-1-git-send-email-dstolee@microsoft.com/
[2] http://on-demand.gputechconf.com/gtc/2014/presentations/S4459-parallel-lossless-compression-using-gpus.pdf


> I realize this would be silly amounts of work. But, if it's worth it,
> perhaps we can benefit from all the GPU computation libs written for
> cryptocoin mining and use them for something good. :)

Currently there is work being done on "protocol v2"[3], which is also
motivated by the desire for easy extensibility in the protocol. So if
you want to put a cryptocoin-secret-handshake into the protocol that
reduces the compute cost or the bandwidth required for your typical
use case, it will be possible to do so with ease.

[3] https://public-inbox.org/git/20180207011312.189834-1-bmwill@google.com/

I wonder if the bitmap code can be sped up using GPUs. Every once in a
while the discussion brings up Bloom filters or inverse Bloom filters
for the negotiation part, and a quick search shows that those could
also be sped up using GPUs.
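
As a rough illustration of why Bloom-filter membership tests map well
to GPUs, here is a minimal sketch (CUDA C; the two-seed FNV-1a hash
and all names are hypothetical stand-ins, not git code) that checks
many fixed-length keys against a shared bit array, one thread per
query:

    #include <stdint.h>

    __device__ static uint32_t fnv1a(const unsigned char *key,
                                     int len, uint32_t seed)
    {
        uint32_t h = 2166136261u ^ seed;
        for (int i = 0; i < len; i++) {
            h ^= key[i];
            h *= 16777619u;
        }
        return h;
    }

    /* hit[i] = 1 if key i is possibly in the set, 0 if definitely not */
    __global__ void bloom_query(const unsigned char *keys, int keylen,
                                int nkeys, const uint32_t *bits,
                                uint32_t nbits, unsigned char *hit)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= nkeys)
            return;
        const unsigned char *k = keys + (size_t)i * keylen;
        uint32_t h1 = fnv1a(k, keylen, 0) % nbits;
        uint32_t h2 = fnv1a(k, keylen, 1) % nbits;
        hit[i] = ((bits[h1 >> 5] >> (h1 & 31)) &
                  (bits[h2 >> 5] >> (h2 & 31)) & 1);
    }

Each query is independent and touches only a couple of words of the
bit array, which is about as SIMD-friendly as it gets.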

Stefan


* Re: Is offloading to GPU a worthwhile feature?
  2018-02-27 20:52 Is offloading to GPU a worthwhile feature? Konstantin Ryabitsev
  2018-02-27 22:08 ` Stefan Beller
@ 2018-04-08 13:59 ` Jakub Narebski
  2018-04-09 17:57   ` Konstantin Ryabitsev
  1 sibling, 1 reply; 5+ messages in thread
From: Jakub Narebski @ 2018-04-08 13:59 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: git

Hello,

Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:

> This is an entirely idle pondering kind of question, but I wanted to
> ask. I recently discovered that some edge providers are starting to
> offer systems with GPU cards in them -- primarily for clients that need
> to provide streaming video content, I guess. As someone who needs to run
> a distributed network of edge nodes for a fairly popular git server, I
> wondered if git could at all benefit from utilizing a GPU card for
> something like delta calculations or compression offload, or if benefits
> would be negligible.
>
> I realize this would be silly amounts of work. But, if it's worth it,
> perhaps we can benefit from all the GPU computation libs written for
> cryptocoin mining and use them for something good. :)

The problem is that you need to transfer the data from the main memory
(host memory), geared towards low latency thanks to the cache hierarchy,
to the GPU memory (device memory), geared towards bandwidth and parallel
access, and back again.  So for offloading to make sense, the time for
copying the data plus the time to perform the calculations on the GPU
(and not all kinds of computations can be sped up on a GPU -- you need a
fine-grained, massively data-parallel task) must be less than the time
to perform the calculations on the CPU (with multi-threading).
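
A minimal benchmark sketch of that constraint (CUDA C, assuming an
nVidia card and the CUDA runtime; the buffer size is an arbitrary
example) would measure the round-trip copy time that any offloaded
computation has to amortize:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        size_t size = 256 << 20;    /* 256 MiB test buffer */
        char *host = (char *)malloc(size);
        void *dev;
        cudaEvent_t start, stop;
        float up_ms, down_ms;

        cudaMalloc(&dev, size);
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start, 0);
        cudaMemcpy(dev, host, size, cudaMemcpyHostToDevice);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&up_ms, start, stop);

        cudaEventRecord(start, 0);
        cudaMemcpy(host, dev, size, cudaMemcpyDeviceToHost);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&down_ms, start, stop);

        /* offload pays off only if up + t_gpu + down < t_cpu */
        printf("up: %.1f ms, down: %.1f ms\n", up_ms, down_ms);

        cudaFree(dev);
        free(host);
        return 0;
    }

On PCIe 3.0 x16 those copies run at roughly 10-12 GB/s in practice, so
for a gigabyte of pack data the GPU has to beat the CPU by something
like 150-200 ms before offloading wins anything.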

Also, you would need to keep the non-GPU and GPGPU code in sync.  Some
parts of the code do not change much; and there are also solutions for
generating the dual code from one source.
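
For the single-source approach, one option is CUDA C's __host__
__device__ annotation, which makes nvcc emit both a CPU and a GPU
version from one definition (the helper below is a hypothetical
example, not from git):

    /* one step of an Adler-32-style checksum, compiled once for
     * the host and once for the device */
    __host__ __device__ static unsigned int
    adler32_step(unsigned int a, unsigned int b, unsigned char byte)
    {
        a = (a + byte) % 65521;
        b = (b + a) % 65521;
        return (b << 16) | a;
    }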

Still, it might be a good idea,
-- 
Jakub Narębski



* Re: Is offloading to GPU a worthwhile feature?
  2018-04-08 13:59 ` Jakub Narebski
@ 2018-04-09 17:57   ` Konstantin Ryabitsev
  2018-04-11 16:46     ` Jakub Narebski
  0 siblings, 1 reply; 5+ messages in thread
From: Konstantin Ryabitsev @ 2018-04-09 17:57 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git


On 04/08/18 09:59, Jakub Narebski wrote:
>> This is an entirely idle pondering kind of question, but I wanted to
>> ask. I recently discovered that some edge providers are starting to
>> offer systems with GPU cards in them -- primarily for clients that need
>> to provide streaming video content, I guess. As someone who needs to run
>> a distributed network of edge nodes for a fairly popular git server, I
>> wondered if git could at all benefit from utilizing a GPU card for
>> something like delta calculations or compression offload, or if benefits
>> would be negligible.
> 
> The problem is that you need to transfer the data from the main memory
> (host memory), geared towards low latency thanks to the cache hierarchy,
> to the GPU memory (device memory), geared towards bandwidth and parallel
> access, and back again.  So for offloading to make sense, the time for
> copying the data plus the time to perform the calculations on the GPU
> (and not all kinds of computations can be sped up on a GPU -- you need a
> fine-grained, massively data-parallel task) must be less than the time
> to perform the calculations on the CPU (with multi-threading).

Would something like this be well suited to tasks like routine fsck,
repacking, and bitmap generation? Those are the kinds of workloads I was
imagining would benefit most.

> Also, you would need to keep the non-GPU and GPGPU code in sync.  Some
> parts of the code do not change much; and there are also solutions for
> generating the dual code from one source.
> 
> Still, it might be a good idea,

I'm still totally the wrong person to be implementing this, but I do
have access to Packet.net's edge systems, which carry powerful GPUs for
projects that might need them for video streaming services. It seems a
shame to have them sitting idle if I could offload some of the RAM- and
CPU-hungry tasks, like repacking, to run there.

Best,
-- 
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation



* Re: Is offloading to GPU a worthwhile feature?
  2018-04-09 17:57   ` Konstantin Ryabitsev
@ 2018-04-11 16:46     ` Jakub Narebski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Narebski @ 2018-04-11 16:46 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: git

Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
> On 04/08/18 09:59, Jakub Narebski wrote:

>>> This is an entirely idle pondering kind of question, but I wanted to
>>> ask. I recently discovered that some edge providers are starting to
>>> offer systems with GPU cards in them -- primarily for clients that need
>>> to provide streaming video content, I guess. As someone who needs to run
>>> a distributed network of edge nodes for a fairly popular git server, I
>>> wondered if git could at all benefit from utilizing a GPU card for
>>> something like delta calculations or compression offload, or if benefits
>>> would be negligible.
>> 
>> The problem is that you need to transfer the data from the main memory
>> (host memory), geared towards low latency thanks to the cache hierarchy,
>> to the GPU memory (device memory), geared towards bandwidth and parallel
>> access, and back again.  So for offloading to make sense, the time for
>> copying the data plus the time to perform the calculations on the GPU
>> (and not all kinds of computations can be sped up on a GPU -- you need a
>> fine-grained, massively data-parallel task) must be less than the time
>> to perform the calculations on the CPU (with multi-threading).
>
> Would something like this be well suited to tasks like routine fsck,
> repacking, and bitmap generation? Those are the kinds of workloads I was
> imagining would benefit most.

All of those, I think, would need to use some graph algorithms.  While
there are ready-made graph libraries for GPUs (like nVidia's nvGRAPH),
graphs are irregular structures, not that well suited to the SIMD type
of parallelism that GPUs are best at.

I also wonder if the amount of memory on a GPU would be enough (and if
not, whether it would be possible to perform the calculations in batches).
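
If it is not enough, batching is the usual workaround. Here is a
minimal sketch (CUDA C; the kernel and chunk size are hypothetical
placeholders for the real work) that streams fixed-size chunks through
one device buffer:

    #include <cuda_runtime.h>

    /* trivial stand-in for whatever per-byte work gets offloaded */
    __global__ void process_chunk(unsigned char *buf, size_t n)
    {
        size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            buf[i] ^= 0xff;
    }

    static void process_in_batches(unsigned char *data, size_t total)
    {
        size_t chunk = 512 << 20;   /* stay under device memory */
        unsigned char *dev;

        cudaMalloc((void **)&dev, chunk);
        for (size_t off = 0; off < total; off += chunk) {
            size_t n = total - off < chunk ? total - off : chunk;
            cudaMemcpy(dev, data + off, n, cudaMemcpyHostToDevice);
            process_chunk<<<(unsigned)((n + 255) / 256), 256>>>(dev, n);
            cudaMemcpy(data + off, dev, n, cudaMemcpyDeviceToHost);
        }
        cudaFree(dev);
    }

Overlapping the copies and kernels with streams would hide part of the
transfer cost; the sketch stays synchronous for clarity.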

>> Also, you would need to keep the non-GPU and GPGPU code in sync.  Some
>> parts of the code do not change much; and there are also solutions for
>> generating the dual code from one source.
>> 
>> Still, it might be a good idea,
>
> I'm still totally the wrong person to be implementing this, but I do
> have access to Packet.net's edge systems, which carry powerful GPUs for
> projects that might need them for video streaming services. It seems a
> shame to have them sitting idle if I could offload some of the RAM- and
> CPU-hungry tasks, like repacking, to run there.

Happily, GPGPU programming (in CUDA C mainly, which limits its use to
nVidia hardware) is one of my areas of interest...

Best regards,
--
Jakub Narębski

