Friday, June 20, 2014

See You in Prague

For those of you going to SPAA this coming week, I'll see you there.  I'll be giving the last two talks at the conference, to what I expect (based on the timing) will be a nearly empty room.  That just means there will be no pressure.

If you want to hear more about the papers, you can go to Abstract Talk, where I talk about the papers.  Here is the link for the paper Balanced Allocations and Double Hashing, and the link for the paper Parallel Peeling Algorithms.  I haven't done podcasts for Abstract Talk before, so be forgiving if you go to listen.  It seems like a cool idea;  what do people think of it in practice? 

For those who expect to be sightseeing in Prague during the final session (or who just aren't going to SPAA), here's the brief overview. 

For Balanced Allocations and Double Hashing:
In the well-known balanced allocations paradigm, balls are hashed sequentially into bins, where each ball gets d random choices from the hash functions, and is then placed in the least loaded of those bins.  With double hashing, we replace the d random choices with d choices of the form a, a+b, a+2b, a+3b, ..., a+(d-1)b, where a and b are random values (determined by hashing).  That is, we build the d choices from 2 random numbers instead of using d random numbers.  (The numbers are taken mod the size of the hash table, and b should be relatively prime to the hash table size... let's stop worrying about details.)  We find empirically that this makes no difference, in a very strong sense;  the fraction of bins with load j appears the same for every value of j for both systems, so you can't really tell them apart.  We provide a theoretical explanation, based on fluid limit models, for why this happens.
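
If you'd like to see the effect for yourself, here's a minimal Python sketch of the kind of simulation involved (my own illustration, not code from the paper).  It compares the load distribution under d fully random choices against d choices built from two hashed values; the prime table size is just a convenient assumption that keeps b relatively prime to the table size.

```python
import random
from collections import Counter

def load_distribution(n_balls, n_bins, d, double_hashing=False):
    """Throw n_balls into n_bins with d choices per ball; return the fraction of bins with each load."""
    bins = [0] * n_bins
    for _ in range(n_balls):
        if double_hashing:
            a = random.randrange(n_bins)
            b = random.randrange(1, n_bins)  # n_bins prime => any nonzero b is relatively prime to it
            choices = [(a + i * b) % n_bins for i in range(d)]
        else:
            choices = [random.randrange(n_bins) for _ in range(d)]
        # Place the ball in the least loaded of its d chosen bins.
        target = min(choices, key=lambda c: bins[c])
        bins[target] += 1
    counts = Counter(bins)
    return {load: counts[load] / n_bins for load in sorted(counts)}

if __name__ == "__main__":
    n_bins = 100003          # a prime table size (an assumption for this sketch)
    n_balls = n_bins
    print("fully random:  ", load_distribution(n_balls, n_bins, d=4))
    print("double hashing:", load_distribution(n_balls, n_bins, d=4, double_hashing=True))
```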

For Parallel Peeling Algorithms:
The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than k are removed until there are no vertices of degree less than k left.  The remaining hypergraph is known as the k-core.  We analyze parallel peeling processes, where in each round, all vertices of degree less than k are removed.  It is known that, below a specific edge density threshold, the k-core is empty with high probability.  We show that, with high probability, below this threshold, only O(log log n) rounds of peeling are needed to obtain the empty k-core for r-uniform hypergraphs.  Interestingly, above this threshold, Ω(log n) rounds of peeling are required to find the non-empty k-core.  Since most algorithms and data structures aim to peel to an empty k-core, this asymmetry appears fortunate;  nature is on our side.  We verify the theoretical results both with simulation and with a parallel implementation using graphics processing units (GPUs).  Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice.
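
To get a feel for the parallel peeling process itself, here's a small, unoptimized Python sketch (my own illustration, nothing like the GPU implementation in the paper).  It peels a random r-uniform hypergraph in rounds, removing every vertex of degree less than k (together with its incident edges) in each round, and reports how many rounds were needed; the edge density c = 0.7 is just an assumption chosen to sit below the empty-2-core threshold for r = 3 (roughly 0.818).

```python
import random

def parallel_peel(n, edges, k):
    """Peel in parallel rounds: each round removes every surviving vertex of degree < k,
    along with its incident edges.  Returns (rounds used, vertices left in the k-core)."""
    degree = [0] * n
    incident = [[] for _ in range(n)]          # edge indices containing each vertex
    for idx, e in enumerate(edges):
        for v in e:
            degree[v] += 1
            incident[v].append(idx)
    vertex_alive = [True] * n
    edge_alive = [True] * len(edges)
    rounds = 0
    while True:
        # Snapshot the vertices to remove, so the whole round acts "in parallel".
        to_remove = [v for v in range(n) if vertex_alive[v] and degree[v] < k]
        if not to_remove:
            break
        rounds += 1
        for v in to_remove:
            vertex_alive[v] = False
            for idx in incident[v]:
                if edge_alive[idx]:
                    edge_alive[idx] = False
                    for u in edges[idx]:
                        degree[u] -= 1
    return rounds, {v for v in range(n) if vertex_alive[v]}

if __name__ == "__main__":
    n, r, k, c = 100000, 3, 2, 0.7             # assumptions for this sketch
    edges = [tuple(random.sample(range(n), r)) for _ in range(int(c * n))]
    rounds, core = parallel_peel(n, edges, k)
    print("rounds:", rounds, "k-core size:", len(core))
```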

The Parallel Peeling Algorithms paper was, to my surprise, awarded Best Paper.  Maybe I'll write up more about the surprise at some point, but I'd certainly like to thank the committee for the honor, and pass the credit to where it is due, with my co-authors Jiayang Jiang and Justin Thaler. 


NSF Thanks for the Year

It's that time of year where, as a background process, I have to do my annual reviews for the NSF.  It's generally not a very exciting task, and their online forms remain, I think, unpleasantly designed.  (I think they fixed a problem I had last year, so at least now they point out more clearly what you haven't filled out.)

That being said, this year, even more than others, I find myself gratefully filling them out.  The NSF provides the money for my (and my students') research.  And research is fun.  In my few years playing administrator, I was trying to keep up with my research, but inevitably my available time and corresponding output started to decline.  This year, I've been able to "get back into it", and it has made me realize how much I enjoy it.  Sure, it's often frustrating.  And writing it down (and dealing with conferences, reviews, etc.) can also be frustrating.  But overall the creative process of research, including all the frustrating parts, is the most enjoyable part of the job*, and I'm glad that the NSF is there to support it.

Thanks NSF.  I'll try to have those reports in well before the nominal deadline....

[* Yes, I like teaching and interacting with students too -- otherwise I'd look for work in one of the many great research labs -- that's the other most enjoyable part of the job.]  

Monday, June 16, 2014

Sad News: Berthold Vöcking

I have just seen the news that Berthold Vöcking passed away.  For those who didn't know him, Berthold was an outstanding researcher in algorithms.  We had several shared interests and were of a similar age;  I always enjoyed his work, and very much enjoyed the few times that I worked with him.  His paper How Asymmetry Helps Load Balancing still amazes me, and is one of the most wonderful papers that I wish I had written.  When I first saw it I simply didn't believe the main result*;  I stopped my other work, wrote up a simulation, and found it matched his analysis.  I then had to spend the next few days understanding it.  Once I understood it, I wondered how he had seen it, and how I had missed it.  It is a truly inspired paper.

Of course he has other great papers, including the well-known Tight Bounds for Worst-Case Equilibria, and maybe the less well-known but very interesting (particularly to me) work on Balanced Allocations:  The Heavily Loaded Case.  He had just won a best paper award at ICALP for work on Online Independent Sets.  He was one of the editors of Algorithms Unplugged, a wonderful book project.

Berthold was regularly on my list of people to invite to workshops, because he always had very interesting work to present and was interesting to talk to.  I can't believe I won't have another chance to talk with him.  His passing is a loss to our community.  

* For those who care, his result concerns hashing with the "power of two choices".  Suppose that instead of having each new item make two random choices over the whole table and then placing it in the least loaded bin, you split the table into two equal halves (call them left and right), have each item make one random choice from each half, and then place the item in the least loaded of the two, except that you always break ties in favor of the left half.  There wouldn't seem to be any difference between the two schemes, but there is;  the split-and-break-ties approach works significantly better.
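
For anyone who wants to check this the same way I did, here's a quick Python sketch (my own, not Berthold's analysis or code) comparing the maximum load of the standard two-choice scheme against the split-and-always-go-left variant.

```python
import random

def max_load_two_choice(n_balls, n_bins):
    """Standard power of two choices: two uniform choices, place in the less loaded bin."""
    bins = [0] * n_bins
    for _ in range(n_balls):
        a, b = random.randrange(n_bins), random.randrange(n_bins)
        bins[a if bins[a] <= bins[b] else b] += 1
    return max(bins)

def max_load_always_go_left(n_balls, n_bins):
    """Split the table into halves, one choice per half, and break ties to the left half."""
    half = n_bins // 2
    left, right = [0] * half, [0] * half
    for _ in range(n_balls):
        a, b = random.randrange(half), random.randrange(half)
        if left[a] <= right[b]:    # ties go left
            left[a] += 1
        else:
            right[b] += 1
    return max(max(left), max(right))

if __name__ == "__main__":
    n = 1_000_000                  # n balls into n bins (an assumption for this sketch)
    print("two choices, whole table:", max_load_two_choice(n, n))
    print("split, always go left:   ", max_load_always_go_left(n, n))
```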