6 minute read

In this article, I’d like to explain a limitation of effective resistance in the ML area and how its generalization, the $p$-resistance, overcomes it.


Note: This article is a promotion of our recent paper at ICML.


TL;DR

The 2-resistance has a limitation, which is overcome by generalizing from the 2-resistance to the $p$-resistance.

The Opener

Effective resistance is a mathematically fun tool. Thanks to its fun properties, it has fascinated many people. At the same time, effective resistance sits slightly outside the graph machine learning limelight, while, as time goes by, spectral clustering and then GNNs have come to enjoy the spot.

There may be a reason for this. I’ll explain it below, but before that, let me briefly establish the notation.

Recap of Effective Resistance

Let $G=(V,E)$ be a graph. Effective resistance over a graph comes from an analogy between a graph and an electrical circuit. We define the energy over a graph as

\[S_{G,2} (\mathbf{x}) := \sum_{i,j \in V} a_{ij} |x_{i} - x_{j}|^2,\]

where $\mathbf{x}$ is a voltage distribution and $a_{ij}$ is the weight of the edge between $i$ and $j$. The effective resistance between vertices $i$ and $j$ is defined as

\[r_{G,2}(i,j) := 1/(\min_{\mathbf{x}} \{S_{G,2} (\mathbf{x})\quad \mathrm{s.t.}\ x_{i} - x_{j} = 1\}).\]

Note that the effective resistance can be thought of as the inverse of the optimal value of the two-pole semi-supervised learning (SSL) problem, i.e., the SSL problem with two fixed labels.
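To make this concrete, here is a minimal sketch in Python/NumPy, using a small toy adjacency matrix that I made up for illustration. It evaluates $r_{G,2}(i,j)$ straight from the variational definition: since the energy only depends on differences, we may pin $x_{i}=1$ and $x_{j}=0$, and the optimal values on the remaining vertices are exactly the harmonic (two-pole SSL) solution, obtained from one linear system.

```python
import numpy as np

# A toy weighted, undirected graph given by its adjacency matrix (made up for illustration).
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])

def energy_2(A, x):
    """S_{G,2}(x) = sum_{i,j in V} a_ij |x_i - x_j|^2, summing over all ordered pairs."""
    return float(np.sum(A * (x[:, None] - x[None, :]) ** 2))

def effective_resistance_2(A, i, j):
    """r_{G,2}(i,j) = 1 / min { S_{G,2}(x) : x_i - x_j = 1 }.

    The energy is invariant under adding a constant to x, so we may fix
    x_i = 1 and x_j = 0. The optimal values on the remaining ("free")
    vertices satisfy the harmonic condition (L x)_free = 0, i.e. they form
    the two-pole SSL solution and come out of a single linear system.
    """
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A                      # graph Laplacian
    free = [k for k in range(n) if k not in (i, j)]
    x = np.zeros(n)
    x[i] = 1.0
    if free:
        rhs = -L[np.ix_(free, [i, j])] @ x[[i, j]]
        x[free] = np.linalg.solve(L[np.ix_(free, free)], rhs)
    return 1.0 / energy_2(A, x)

print(effective_resistance_2(A, 0, 3))                  # 0.5 for this toy graph
```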

This effective resistance is generalized to the $p$-resistance as follows. We consider the energy

\[S_{G,p} (\mathbf{x}) := \sum_{i,j \in V} a_{ij} |x_{i} - x_{j}|^p,\]

and the effective $p$-resistance as

\[r_{G,p}(i,j) := 1/(\min_{\mathbf{x}} \{S_{G,p} (\mathbf{x})\quad \mathrm{s.t.}\ x_{i} - x_{j} = 1\}).\]

The above is a dry collection of definitions of effective resistance, with all the fun stripped out. I refer to my previous posts (on effective resistance and on effective $p$-resistance) for the fun details.
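To keep the recap tangible, here is a similar sketch for the $p$-resistance. There is no linear system for general $p$, so this brute-force version (again on a made-up toy graph, using SciPy’s generic optimizer) just minimizes $S_{G,p}$ numerically with $x_i = 1$ and $x_j = 0$; it only illustrates the definition and is not the efficient computation discussed later.

```python
import numpy as np
from scipy.optimize import minimize

# Toy weighted adjacency matrix (made up for illustration).
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])

def energy_p(A, x, p):
    """S_{G,p}(x) = sum_{i,j in V} a_ij |x_i - x_j|^p, over all ordered pairs."""
    return float(np.sum(A * np.abs(x[:, None] - x[None, :]) ** p))

def p_resistance(A, i, j, p):
    """r_{G,p}(i,j) = 1 / min { S_{G,p}(x) : x_i - x_j = 1 }.

    As in the p = 2 case, the energy depends only on differences, so we pin
    x_i = 1 and x_j = 0 and minimize over the remaining coordinates. For
    p > 1 the energy is convex, so a generic solver suffices at toy sizes.
    """
    n = A.shape[0]
    free = [k for k in range(n) if k not in (i, j)]

    def objective(z):
        x = np.zeros(n)
        x[i] = 1.0
        x[free] = z
        return energy_p(A, x, p)

    res = minimize(objective, np.full(len(free), 0.5), method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 20000})
    return 1.0 / res.fun

for p in (1.5, 2.0, 4.0):
    print(p, p_resistance(A, 0, 3, p))                  # p = 2 should match the previous sketch
```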

One wet thing we can already see is the connection between SSL and effective resistance.

Let us consider the following SSL problem.

\[\mathbf{x}^{ij,p*}:=\text{argmin}_{\mathbf{x}} S_{G,p} (\mathbf{x}) \quad \text{s.t. } x_{i} - x_{j} = 1 \quad (\dagger)\]

Then, for the $p=2$ case, the solution $\mathbf{x}^{ij,2*}$ of this SSL problem and the effective resistance are connected as

\[x^{ij,2*}_{i} - x^{ij,2*}_{\ell} \geq x^{ij,2*}_{\ell} - x^{ij,2*}_{j} \iff r_{G,2} (i, \ell) \geq r_{G,2} (\ell,j) \quad (\ast)\]

for any third vertex $\ell$. This shows the equivalence between the effective resistance and the solution of the SSL problem. This is helpful, but it has been an open problem whether this holds for general $p>1$. See the paper by Alamgir et al. (2010, NIPS) for details.

Spoiler Alert: Our new paper resolves this open problem positively.

Limitation of Effective Resistance

Let’s get back to where we were. What I wanted to discuss is why effective resistance has not hit the limelight.

Below, we consider the scenario where the graph is constructed from vector data by some method, such as an $\epsilon$-graph, a $k$-NN graph, or a Gaussian kernel.

I speculate that the reason is this: several papers pointed out limitations of effective resistance in exactly this setting (the ones by von Luxburg et al. and Nadler et al. that I discuss below), and their findings read like sensational headlines.

Sensational headlines from a tabloid called NIPS¹. This sounds scandalous for effective resistance. Although effective resistance is a charming fellow, that fellow might have had to resign from office if effective resistance were a politician.

I am joking here, but the content of these papers is indeed solid. The effective resistance is reported to converge to a meaningless function as we draw an infinite number of data points, i.e., in the asymptotic regime,

\[r_{G,2} (i,j) \to \frac{1}{d_{i}} + \frac{1}{d_{j}},\]

where $d_{j} := \sum_{i \in V} a_{ij}$ is the degree of vertex $j$.
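To get a feel for this, here is a small experiment of my own making (not a reproduction of the original paper’s setting): build a dense Gaussian-kernel graph on uniformly random points and compare $r_{G,2}(i,j)$ with $1/d_{i} + 1/d_{j}$. The exact constant depends on normalization conventions and on the assumptions of the theorem, so treat the printed ratio only as an illustration of how little the resistance ends up depending on anything beyond the two degrees.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_graph(X, h):
    """Dense Gaussian-kernel adjacency matrix on the points X (bandwidth h)."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-D2 / (2.0 * h ** 2))
    np.fill_diagonal(A, 0.0)
    return A

def effective_resistance_2(A, i, j):
    """Same variational computation as in the earlier sketch."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    free = [k for k in range(n) if k not in (i, j)]
    x = np.zeros(n)
    x[i] = 1.0
    rhs = -L[np.ix_(free, [i, j])] @ x[[i, j]]
    x[free] = np.linalg.solve(L[np.ix_(free, free)], rhs)
    return 1.0 / float(np.sum(A * (x[:, None] - x[None, :]) ** 2))

for n in (100, 400, 1600):
    X = rng.uniform(size=(n, 2))            # random points in the unit square
    A = gaussian_graph(X, h=0.3)            # bandwidth chosen arbitrarily for illustration
    d = A.sum(axis=1)
    for i, j in [(0, 1), (0, n // 2), (0, n - 1)]:
        r = effective_resistance_2(A, i, j)
        # The resistance is essentially a function of the two degrees alone;
        # the ratio should stabilize as n grows (its value depends on conventions).
        print(n, (i, j), r, 1.0 / d[i] + 1.0 / d[j], r / (1.0 / d[i] + 1.0 / d[j]))
```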

We probably thought that effective resistance was a cooler fellow, i.e., that it reflects a more nuanced graph structure. However, in this limit the effective resistance does not reflect the whole graph structure. Instead, it only looks at a very partial piece of structure, namely the degrees of $i$ and $j$ (for an unweighted graph, the number of vertices connected to each). This sounds less sexy.

Also, Nadler et al. showed that, under assumptions similar to von Luxburg’s, the solution of the SSL problem converges to a constant function except near the constrained points as the number of data points goes to infinity. Since an almost constant solution carries no information about the labels, no learning occurs in the SSL problem $(\dagger)$.

It is not hard to guess that the solution of the SSL problem and the 2-resistance are related, since the denominator of the 2-resistance is exactly the optimal value of the SSL problem. However, how exactly they are related is not trivial. The relationship $(\ast)$ shows that Nadler et al. and von Luxburg et al. corroborate each other: if one of the SSL solution or the effective resistance is meaningless, the other is meaningless as well.

To sum up this section, the effective resistance is less sexy than we thought. “Less sexy” is my word; the papers describe it as meaningless or as having a limitation. This, I speculate, is one of the reasons why effective resistance does not enjoy the spotlight.

Here Is a Hero, $p$

How do we overcome this problem? The generalization from the 2-resistance to the $p$-resistance helps. In short, a larger $p$ overcomes the problem.

Alamgir et al. (2010, NIPS) show that, under some assumptions, the $p$-resistance converges to

\[r_{G,p}(i,j) \to \frac{1}{d^{p-1}_{i}} + \frac{1}{d^{p-1}_{j}},\]

which is also a meaningless function when $p$ is small. However, when $p$ is large, the $p$-resistance converges to a value that reflects a more nuanced graph structure.

Furthermore, the solution of the semi-supervised learning problem for general $p>1$ converges to a meaningful function when $p$ is large; Alaoui et al. (2016, JMLR) showed the case where $p$ is an integer, and Slepcev et al. (2019, SIMA) generalized the result to general $p>1$. For small $p$, again, the solution converges to a meaningless function.

Again, these two results seem related. Do we have some relationship like Equation $(\ast)$? It had been an open problem whether such a relationship holds for general $p>1$. Now, our recent paper resolves this positively, i.e.,

\[x^{ij,p*}_{i} - x^{ij,p*}_{\ell} \geq x^{ij,p*}_{\ell} - x^{ij,p*}_{j} \iff r_{G,p} (i, \ell) \geq r_{G,p} (\ell,j) \quad (\ddagger).\]

The relationship $(\ddagger)$ shows that Alamgir et al. and Slepcev et al. corroborate each other: when $p$ is small, if one of the SSL solution or the $p$-resistance is meaningless, the other is also meaningless, while when $p$ is large, if one is meaningful, the other is also meaningful.
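The relationship $(\ddagger)$ can also be poked at numerically. Below is a sanity-check sketch on a made-up five-vertex graph with $p=3$: it computes the SSL solution $\mathbf{x}^{ij,p*}$ of $(\dagger)$, the $p$-resistances $r_{G,p}(i,\ell)$ and $r_{G,p}(\ell,j)$ for every third vertex $\ell$, and checks that the two orderings in $(\ddagger)$ agree. This is of course not a proof, only a toy confirmation; the proof is in the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Toy weighted adjacency matrix (made up for illustration).
A = np.array([
    [0.0, 2.0, 1.0, 0.0, 0.0],
    [2.0, 0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
])

def energy_p(A, x, p):
    """S_{G,p}(x) = sum_{i,j in V} a_ij |x_i - x_j|^p, over all ordered pairs."""
    return float(np.sum(A * np.abs(x[:, None] - x[None, :]) ** p))

def minimize_pinned(A, i, j, p):
    """Minimize S_{G,p}(x) with x_i = 1 and x_j = 0 (equivalent to x_i - x_j = 1,
    since the energy is shift-invariant; differences of the solution are unaffected)."""
    n = A.shape[0]
    free = [k for k in range(n) if k not in (i, j)]

    def objective(z):
        x = np.zeros(n)
        x[i] = 1.0
        x[free] = z
        return energy_p(A, x, p)

    res = minimize(objective, np.full(len(free), 0.5), method="Nelder-Mead",
                   options={"xatol": 1e-12, "fatol": 1e-14, "maxiter": 50000})
    x = np.zeros(n)
    x[i] = 1.0
    x[free] = res.x
    return x, res.fun

def p_resistance(A, i, j, p):
    return 1.0 / minimize_pinned(A, i, j, p)[1]

i, j, p = 0, 4, 3.0
x_star, _ = minimize_pinned(A, i, j, p)                 # the SSL solution x^{ij,p*}
for ell in range(A.shape[0]):
    if ell in (i, j):
        continue
    lhs = (x_star[i] - x_star[ell]) >= (x_star[ell] - x_star[j])
    rhs = p_resistance(A, i, ell, p) >= p_resistance(A, ell, j, p)
    # The two booleans should agree (up to numerical precision in case of near-ties).
    print(ell, lhs, rhs)
```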

Hero Isn’t Perfect

By generalizing from 2 to $p$, the limitation of the resistance is overcome. Yes, peace prevails on earth again. Now the $p$-resistance can take center stage in graph ML and dance under the limelight… but this has not happened so far.

The reason is that the $p$-resistance is computationally expensive, especially when we want it for many pairs. For the 2-resistance we have ways to compute it quickly; however, we have not had such tools for the $p$-resistance. Our recent paper proposes a way to address this.
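For completeness, here is what a fast route for the 2-resistance can look like (one standard route, not necessarily the one our paper has in mind): a single pseudoinverse of the Laplacian gives the resistance of every pair at once; well-known randomized approximations can push this further, but I will not go into them here. One caveat on conventions: with the energy of this post, which sums over all ordered pairs and therefore counts each edge twice, the variational $r_{G,2}$ is half of the textbook value $(\mathbf{e}_i - \mathbf{e}_j)^{\top} L^{+} (\mathbf{e}_i - \mathbf{e}_j)$; if you read the energy as a sum over edges, drop that factor.

```python
import numpy as np

# Toy weighted adjacency matrix (made up for illustration).
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])

L = np.diag(A.sum(axis=1)) - A
Lp = np.linalg.pinv(L)                  # one pseudoinverse ...

# ... gives the classical resistance of every pair at once:
# R[i, j] = Lp[i, i] + Lp[j, j] - 2 * Lp[i, j].
diag = np.diag(Lp)
R = diag[:, None] + diag[None, :] - 2.0 * Lp

# With the energy summed over ordered pairs (each edge counted twice),
# the variational r_{G,2} of this post is half of the classical value.
r2 = R / 2.0
print(r2[0, 3])                         # 0.5, matching the first sketch
```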

Conclusion

While the effective resistance is fun, it enjoys less of the limelight in the ML area. This might be due to its limitation in the asymptotic regime. This article advertises that if we generalize from the 2-resistance to the $p$-resistance, this limitation is overcome when $p$ is large.


Note:

The papers discussing SSL in this blog post show their results not only for the case of two known labels but also for any number of known labels. Also, since the assumptions vary from paper to paper, please check each paper before reusing its results. The relationships $(\ast)$ and $(\ddagger)$ hold for any graph, not necessarily one built from vector data.


Advert:

Again, our paper proposes an approximation with which we can compute the $p$-resistance faster. If you are interested, please read our paper or see my previous blog article.


Citation:

If you find this piece useful, please cite our paper. Appendix A provides a comprehensive review of the topic of this article, the limitation of the $p$-Laplacian for multi-class clustering.

@inproceedings{saito2023multi,
 title={Multi-class Graph Clustering via Approximated Effective $p$-Resistance.},
 author={Saito, Shota and Herbster, Mark},
 booktitle={Proc. ICML},
 pages = {29697--29733},
 year={2023}
}
1. Just to follow up: this is just a joke.