Talk:Moore–Penrose inverse/Archive 1


Generalization to linear mappings

I removed

The pseudoinverse is defined as above for linear mappings in general (not checked in books, can somebody confirm?).

No, it's not defined on a linear map between arbitrary vector spaces. Of course, one needs at least an inner product space for it to make sense. There are some operators on infinite-dimensional Hilbert spaces with generalized inverses. For example, construct an operator on l2 with a suitable singular value decomposition and use the SVD to construct the generalized inverse. User:Billlion 11:33, 7 Sep 2004

What I would like to do is to move away from the requirement that "A is a matrix". As far as I can see, there is no need to refer to a basis. Would "linear operator on an inner product space of finite dimensionality" be sufficient? RainerBlome 22:15, 19 Sep 2004 (UTC)

I think the last thing this article needs is more generality. The point is that mostly people who are interested in the pseudoinverse are practical scientists and engineers using it to solve least-squares problems; if you start the article by talking about linear transformations in inner product spaces, you have lost most of them already. Fine to explain that the notion is independent of basis later on, and hence defined for linear maps, but the article should definitely start off describing the case for a matrix. Billlion 07:59, 20 Sep 2004 (UTC)
Good point. My intent is not really to convert from matrix to LinOp. Rather, I am trying to find out under what conditions the stuff in the article still holds in the case of LinOps. You could say I want to see whether there is a different point of view. If so, it should be shown in the article. However, I have been unable to find out for sure so far.
What would be the first thing that this article needs, in your opinion? RainerBlome 14:25, 20 Sep 2004 (UTC)
When I teach this stuff I start by explaining the need for least squares 'solutions' of overdetermined systems, and least norm solutions of underdetermined ones. In most practical situations where linear algebra is applied, the system to be solved is overdetermined, as sensible scientists always use more measurements than they have unknowns. Without this part of the story, the pseudoinverse looks like it is just contrived as an algebraist's whim. Billlion 19:20, 20 Sep 2004 (UTC)
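To make the two cases concrete, here is a minimal NumPy sketch (my own illustration; the matrix sizes and random data are arbitrary, not taken from any source): pinv gives the least-squares solution when the system is overdetermined and the minimum-norm exact solution when it is underdetermined.

import numpy as np

rng = np.random.default_rng(0)

# Overdetermined: more equations than unknowns; pinv gives the least-squares minimizer.
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x_ls = np.linalg.pinv(A) @ b
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)   # same minimizer from the least-squares solver
print(np.allclose(x_ls, x_ref))                 # True

# Underdetermined: fewer equations than unknowns; pinv picks the minimum-norm exact solution.
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)
x_mn = np.linalg.pinv(A) @ b
print(np.allclose(A @ x_mn, b))                 # True: the system is solved exactly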
The idea of a pseudo-inverse extends to bounded linear operators on arbitrary Hilbert spaces (= complete inner product spaces). The only delicate issue compared to finite-dimensional spaces is that in general the pseudo-inverse of a bounded operator is not necessarily bounded, and thus not necessarily continuous, which can pose problems in numerical simulations of inverse problems (such as backward solution of PDEs) --128.130.51.88 16:15, 12 August 2005 (UTC)

Finding the pseudoinverse of a matrix

Hi, forgive me if I haven't made this comment in the correct syntax, I'm new to this. In the 'Finding the pseudoinverse of a matrix' section you supply two formulas for the pseudoinverse. Only one of these formulas (A+ = inv(A*A) A*) worked for some real example matrices for which I wanted to find the pseudoinverse. The other formula (A+ = A* inv(AA*)) was not computable, because the inverse of (AA*) did not exist. It may be that which formula you should use depends on the matrix in question, namely on whether the rank comes from the rows or from the columns. However, the text does not make this clear. Regards, Desmond Campbell 194.83.139.91 10:27, 7 November 2005 (UTC) d dot campbell at iop dot kcl dot ac dot uk

Right, which formula may be used depends on the column and row ranks. See the special cases section. --RainerBlome 21:21, 7 November 2005 (UTC)
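For illustration, a small NumPy check of the two formulas (an assumed sketch of my own, not taken from the article; the test matrices are arbitrary): the first formula needs linearly independent columns, the second linearly independent rows.

import numpy as np

rng = np.random.default_rng(1)

# Tall matrix with linearly independent columns: A*A is invertible, so A+ = inv(A*A) A*.
A = rng.standard_normal((5, 3))
left = np.linalg.inv(A.conj().T @ A) @ A.conj().T
print(np.allclose(left, np.linalg.pinv(A)))     # True

# Wide matrix with linearly independent rows: AA* is invertible, so A+ = A* inv(AA*).
B = rng.standard_normal((3, 5))
right = B.conj().T @ np.linalg.inv(B @ B.conj().T)
print(np.allclose(right, np.linalg.pinv(B)))    # True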

Which categories to choose?

Jitse, why did you remove the article from the Linear Algebra category? Note that the article's first sentence implies that LA is a category for this article. --RainerBlome 21:35, 10 September 2005 (UTC)

I removed the article from the Linear Algebra category, because it is already in the Matrix Theory category, which implies that it's Linear Algebra. I don't see why it should also be explicitly in the Linear Algebra category, just like it's not in the Mathematics category even though one could just as well say that the first sentence implies that mathematics is a category for this article. -- Jitse Niesen (talk) 21:49, 10 September 2005 (UTC)

Itemization Markup

Is it really helpful to add punctuation (semicolons and a final full stop) at the end of itemization items? The goals for me are:

1. Clarity -- Make it easy to read.

2. Editability -- Make it easy to maintain, which enhances the chances of future edits being correct, which reduces correction workload, which leaves more time for content development.

To 1.: Mathematics books on my shelf are divided on semicolon usage: some use them, some don't. In my perception, terminating semicolons make equation items harder to read, so I prefer to leave them out. The itemization symbols (bullets) provide enough separation as they are. If an item contains text and not just an equation, a full stop is sometimes more appropriate. When punctuation is used, separating it from the equation by a little whitespace makes the equation easier to read.

To 2.: When items are added or moved, the punctuation may have to be changed. This is often overlooked. (Minor point)

I always use punctuation in lists; full stops if the items in a list are full sentences and semicolons if they are not. Most maths style books say that equations should be treated as normal text and that punctuation should be included (to be frank, I'm not sure that most say that, but I have never come across one that said that you can leave out punctuation). Separating punctuation from the equation seems rather strange to me.
You may want to read a related discussion at Wikipedia talk:WikiProject Mathematics/Archive7#periods at the end of formulas -- request for comment, in which it was suggested that engineers typically don't use punctuation.
In the end, I don't think it matters much whether punctuation is used or not, so feel free to change it to whatever you like if you feel strongly about it. However, I do think that mere bullet points of formulas are generally bad style (regardless of punctuation). One should write in English and not in mathematical formulas. -- Jitse Niesen (talk) 00:43, 11 September 2005 (UTC)

Thanks for pointing me to that discussion. I'm all for full stops after actual sentences, but I'm wary of semicolons. So I had searched for 'semicolon', but did not find any discussion. I agree that good prose is preferable to mere lists of equations. However, the loosely related equations commonly found in the properties section are fine as far as I am concerned. In this case they are not part of a common sentence, so should not be linked with semicolons. Some math books use the semicolon as a form of 'postfix separator'. I think that's fine as long as one doesn't use both a prefix and a postfix separator. Originally, I had listed the properties without separators at all, someone else had introduced the bullets. --RainerBlome 20:50, 11 September 2005 (UTC)

The itemization now uses prose. --RainerBlome 19:21, 15 September 2005 (UTC)

Transposition vs. Conjugation

The article should be easy to use both for readers who are using real matrices and for readers who are using complex matrices. Currently, this is not so. Some formulas (limiting process, special cases, applications) use transposition and it is not obvious to me whether transposition may be, or has to be, replaced by conjugate transposition. When I worked with the pseudoinverse, I never used complex-valued elements, so the question never came up. Does anyone see right away instances of A^T and such that can be replaced by A^* and such?

If the formulas using 'just' transposition are fine for complex matrices, that should be noted (but I don't know whether they are). In general, it should be clear where conjugate transpose is required in the complex case and where not. As long as this is not cleared up, at least there should be some kind of warning, that a formula might only hold in the real case. When I last left a similar warning, it was moved to the talk page. Is this standard practice?

In particular, the 'Special Cases' section and the 'Finding...' section give different formulas, one using transposition and one using conjugate transposition. Are there any complex A for which the ^T version does not work? Or, since the pseudoinverse is unique, are the two equivalent?

--RainerBlome 20:30, 15 September 2005 (UTC)

Fixed. In the complex case, conjugate transposition must always be used. Actually, I have a proof only for the first limit expression, but hey, the other one just can't be wrong, can it? --RainerBlome 20:13, 10 October 2005 (UTC)
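As a sanity check, here is a small NumPy sketch (my own illustration, not a proof) showing that for a complex matrix the conjugate transpose is the right choice: the plain-transpose variant violates the Penrose condition that A A+ be Hermitian.

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# Conjugate transpose: matches the pseudoinverse.
P_conj = np.linalg.inv(A.conj().T @ A) @ A.conj().T
print(np.allclose(P_conj, np.linalg.pinv(A)))   # True

# Plain transpose: A P is generally not Hermitian, so P is not the pseudoinverse.
P_plain = np.linalg.inv(A.T @ A) @ A.T
AP = A @ P_plain
print(np.allclose(AP, AP.conj().T))             # False in general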

Derivation ?

"The pseudoinverse is used to find the best solution (in a least squares sense) to systems of linear equations which are either overdetermined or underdetermined".

I think that overdetermined systems use the left pseudo-inverse while, for underdetermined systems, we should use the right pseudo-inverse, shouldn't we?

Gtdj 15:18, 26 August 2006 (UTC)

The sentence that you cite was wrong. The method is also applicable to SLE that are neither overdetermined nor underdetermined. The pseudoinverse can be used as the "best" solution to an SLE that has no unique exact solution (either no solution or infinitely many solutions). The SLE (represented by matrix A) may be "square" (be neither over- nor underdetermined), but still have no unique exact solution, because some rows or some columns are linearly dependent. The more general statement is already present in the Introduction and Application sections, so I removed the sentence.

The pseudoinverse is unique, what do you mean by "right/left pseudo-inverse"? If the SLE is overdetermined, it could be that the columns of A are linearly independent (but they don't have to be). If they are, A+ is a left inverse of A (see section "Special_cases"). If the SLE is underdetermined, it could be that the rows of A are linearly independent (but they don't have to be). If they are, A+ is a right inverse of A. --RainerBlome 22:48, 9 September 2006 (UTC)

"The method is also applicable to SLE that are neither overdetermined nor underdetermined. The pseudoinverse can be used as the "best" solution to an SLE that has no unique exact solution (either no solution or infinitely many solutions)". As an engineer who wants to use the pseudoinverse to get my job done, I find the above statement very useful and I cannot understand why it is not included in the main article. --xerm (talk) 10:54, 19 February 2008 (UTC)

Question on one of the relations

I would like to have an elaboration on relation that was added (I think) on Sept 10, 2005. Under the "Properties" heading, a relation was added that says "Pseudoinverse [A] = Pseudoinverse [Transpose[A] . A] . Transpose [A]".

This is a useful result for my current work. The result is obviously true when "Transpose [A] . A" is invertible. I have looked through all my own references on linear algebra and have not seen the more general case anywhere else. I have not been able to come up with a proof for when "Transpose [A] . A" is non-invertible.

In the case where we use the pseudoinverse to eliminate one or more nearly singular values, does the result still hold true? Obviously we need to take care that we have eliminated "corresponding" near singular values from A and from "Transpose [A] . A". What about when we have replicated values on the diagonal, and decide to eliminate only one of them when computing the pseudoinverse?

John-seymour 19:47, 16 May 2007 (UTC)

Consider the SVD. If you're having trouble, read Ben-Israel and Greville, consult your favorite local mathematician, or ask at the math reference desk. Lunch 19:59, 16 May 2007 (UTC)

A proof for the relation in question can now be found on the Moore-Penrose pseudoinverse/Proofs page. --RainerBlome 21:28, 9 July 2007 (UTC)
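For anyone who just wants a quick numerical sanity check of the relation (not a substitute for the proof), here is a NumPy sketch of my own with a deliberately rank-deficient matrix, so that A*A is not invertible:

import numpy as np

rng = np.random.default_rng(3)
# Rank-2 matrix of size 5 x 4, so A*A (4 x 4) is singular.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

lhs = np.linalg.pinv(A)
rhs = np.linalg.pinv(A.conj().T @ A) @ A.conj().T
print(np.allclose(lhs, rhs))   # True even though A*A is not invertible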

Where should we put proofs?

The question of John-Seymour shows that using less-well known relations such as these is risky. The proof is easy to understand once you have it, but this does not help if a practical problem needs to be solved and no proof can be found immediately.

For this reason, I think that such proofs should be publicly documented somewhere, but where?

  • Currently, relations like these are collected in dedicated article sections. Should the proofs be added there? That would clutter the articles.
  • Put the proofs on the talk pages? Not good, they may easily get lost or damaged, and it would be a misuse of the talk pages.
  • Should there be a new kind of subpage? Sounds good to me.
  • In the end, Wiki books might be the best place for such fine detail. However, looking at wikibooks:Linear_Algebra, it is in a state which has not yet reached this kind of detail.

Has this been discussed before? Where? --RainerBlome 07:33, 5 July 2007 (UTC)

To answer my own question, this has been discussed before, mainly at Wikipedia:WikiProject_Mathematics/Proofs and also at Talk:Mathematical_proof#Proof_archive_-_Do_we_need_one?. I'm still reading those. Among others, they link to the German "Proof Archive", Wikipedia:Manual of Style (mathematics)#Proofs and Category:Article proofs.

It would be good if we could put the proofs in something like Metamath and link to there. That could help the "confidence in the correctness" aspect. However, because of their formality, the proofs on Metamath are not necessarily easy to understand, to put it mildly. And it may be a much larger task to formally add all the requirements of the proofs here before they themselves can be added. In particular, I did not see any proofs in matrix algebra, the closest I saw were proofs about vector spaces.

A Wikimedia place for human-readable proofs is needed. At the moment, I would prefer /Proofs pages, so I just created one: Moore-Penrose pseudoinverse/Proofs. It looks like this is not actually a MediaWiki subpage, but a separate page (with a slash in its name). Please comment. --RainerBlome 08:50, 5 July 2007 (UTC)

Continuity of limiting expressions

Lunch, in your comment to the edit of Pseudoinverse of 06:29, 10 January 2007, you wrote that the "underdetermined case needs to be added". In the underdetermined case, the task is to find any one of the set of best solutions - in other words "a vector that minimizes the euclidean norm of Ax-b". So the underdetermined case is dealt with, as I see it.

I reverted your edit to that section (and added a correction). When the system is underdetermined, there are many solutions (there is no unique minimizer). One needs to add the restriction that the solution of minimum norm is sought. (And the same applies if the system is overdetermined but there are still zero singular values.) Lunch

You also said that the "applications section needs work (among others)". Which aspect needs work the most, could you please elaborate? --RainerBlome 21:44, 30 June 2007 (UTC)

The "lemma" that's there doesn't belong in that section (but should probably be merged into another section of the text). But more importantly, the section needs to be expanded. Pseudoinverses have lots of applications. Sorry, I don't have much time for it. Lunch
Maybe you have time to add a few bullets to the To Do section above? --RainerBlome 07:46, 5 July 2007 (UTC)

Further, in the Special Cases section, you wrote that "the limiting expression is continuous at δ = 0; that is, we can simply substitute δ = 0 in the limiting expression". It is obvious that the value at the limit exists and is unique, but I do not see why continuity follows from this, or why it should hold in general. The argument seems the wrong way around to me. I may lack some basic analysis knowledge here. Anyway, I find the consideration of the limit expressions in these paragraphs more confusing than helpful, so I have just removed those sentences. --RainerBlome 22:40, 30 June 2007 (UTC)

It's probably better that you removed them; I'd agree they were cluttering things. However, they were correct, and again I think you need to consider the case where the matrix has zero singular values. (And I don't think it's "obvious"; your own confusion in the matter is some evidence of that.) Lunch 20:12, 3 July 2007 (UTC)
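For readers following this thread, a small NumPy sketch (my own illustration) of the limiting expression for a matrix with zero singular values: the regularized expression approaches the pseudoinverse as δ shrinks, even though A*A itself is singular.

import numpy as np

rng = np.random.default_rng(4)
# Rank-deficient 4 x 3 matrix, so A*A has a zero eigenvalue.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))
n = A.shape[1]

P = np.linalg.pinv(A)
for delta in (1e-2, 1e-4, 1e-6, 1e-8):
    approx = np.linalg.inv(A.conj().T @ A + delta * np.eye(n)) @ A.conj().T
    print(delta, np.linalg.norm(approx - P))    # error shrinks with delta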

Incremental Algorithms

Regarding the incremental algorithms, I copied this deleted offer to here RainerBlome 21:09, 26 Aug 2005 (UTC):

I am hesitant to write these [algorithms] down here, as I am not sure whether they provide an advantage over SVD at all. When I worked with them, I was not aware of the SVD method or at least I don't remember having been. Send me a note and I may find the time to write them up for Wikipedia.

Moved this here from my talk page --RainerBlome 08:38, 17 September 2007 (UTC):

Hi, thank you for contributing to Moore-Penrose pseudoinverse. Would you possibly know of a more mainstream reference to incremental updates than the Tino Gramß 1992 dissertation, please? Preferably something more widely available, such as a book by a major publisher, a journal article, or an online resource. This dissertation does not seem very useful to the reader. I tried to get hold of it and the closest I got to locating this source is a library card in Germany [1]. Actually getting one's hands on the info seems to be a bit tricky. Thanks! Jmath666 01:33, 14 September 2007 (UTC)
P.S. A quick search on MathSciNet shows that there should be "well-known" generalizations of the Sherman-Morrison formula for the pseudoinverses (cited in MR1958987) but I do not know what they are; it is a bit out of my area. Jmath666 01:42, 14 September 2007 (UTC)

As far as I know, the dissertation is the most detailed reference, so it should be most useful to the reader, once she gets hold of it (;-). No, I don't know of a book or free online references, but you might find commercial online ones. Personally, I have the TeX source (and a hardcopy), but just uploading it would probably violate copyright. In a corresponding research report, you can find more widely published references, but they may be just as hard to come by. These are the relevant references, as far as I can see:

Gramß, Tino: Fast algorithms to find invariant features for a word recognizing neural net. In: Proc. Second Int. Conf. Artificial Neural Networks (Bournemouth), London 1991, 180-184
- Word recognition with the Feature Finding Neural Network (FFNN). In: Neural Networks for Signal Processing, ed. by B.H. Juang, S.Y. Kung, C.A. Kamm, New York 1991, 289-298
- Fast learning-algorithms for a self-optimizing neural network with an application to isolated word recognition. IEE Proc.-F 139, 1992, 391-396

However, since I had no need to, I never checked these. --RainerBlome 09:00, 17 September 2007 (UTC)

Regarding the Sherman-Morrison formula: It is indeed a central piece of the cited algorithms. Gramß's algorithms do not make use of any generalized form of it, they use the plain Sherman-Morrison formula, as applied to invertible matrices. The algorithms do not apply to an arbitrary matrix A (this is now stated in the article). They require that the correlation matrix (A*A if m>=n, AA* if m<=n) is invertible, and Sherman-Morrison is applied to the correlation matrix. Equivalently, for m>=n the columns must be linearly independent, and for m<=n the rows must be linearly independent. These incremental algorithms were developed to accelerate the learning phase in pattern recognition algorithms. In this application, it is possible to ensure that the algorithm's requirements are met.
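To illustrate only the core idea (this is my own NumPy sketch of applying the plain Sherman-Morrison formula to the correlation matrix, not a reproduction of the algorithms in the dissertation): when a new measurement row is appended and the columns stay linearly independent, the inverse correlation matrix, and with it the pseudoinverse, can be updated without refactoring.

import numpy as np

def sherman_morrison(Minv, u, v):
    # Update inv(M) to inv(M + u v^T).
    Mu = Minv @ u
    vM = v @ Minv
    return Minv - np.outer(Mu, vM) / (1.0 + v @ Mu)

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))        # m >= n, linearly independent columns
Cinv = np.linalg.inv(A.T @ A)          # inverse correlation matrix

a = rng.standard_normal(3)             # one new measurement row
Cinv = sherman_morrison(Cinv, a, a)    # now holds inv(A^T A + a a^T)

A_new = np.vstack([A, a])
pinv_new = Cinv @ A_new.T              # A+ = inv(A^T A) A^T in the full-column-rank case
print(np.allclose(pinv_new, np.linalg.pinv(A_new)))   # True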

Regarding "well-known" generalizations of the Sherman-Morrison formula, for the pseudoinverses, that sounds interesting, what does MR1958987 refer to? Can you provide the references cited there? --RainerBlome 09:58, 17 September 2007 (UTC)

The abstract of the article I cited said "well-known". I looked around a bit more; to apply the Sherman-Morrison formula to full-rank A*A or AA* seems indeed to be standard, if not obvious, from the hints in the abstracts I saw. These were all articles on applying it to various things. I cannot provide good references immediately. One would need to follow up on the references in the articles that come up in the search on Sherman-Morrison and pseudoinverse; probably it will be in some older monographs. Get a couple of those books, get copies of the original articles where this came up first, check it is indeed there, etc. I am not a specialist in pseudoinverses at all; if I were, I would know immediately. It is routine, but it would take some time and legwork. Jmath666 00:42, 18 September 2007 (UTC)
[Jmath666, I took the liberty to reformat your following comment to make the references easier to see. RainerBlome]
I am sorry I did not answer your question. The MR number is the unique ID of an article in MathSciNet. The citation that said "well-known" is:
  • Ikramov, Kh. D. and Matin far, M., Rank-one modifications and the recalculation of pseudo-inverse matrices. Russian, translated in Moscow Univ. Comput. Math. Cybernet. 2002, no. 4, 10--18 (2003), MR1958987.
Other papers that might be relevant are:
  • Nayak, R. P.; Foudriat, Edwin C. Sequential parameter estimation using pseudoinverse. IEEE Trans. Automatic Control AC-19 (1974), 81--83,
  • An extension of the Sherman-Morrison identity to the case of rank-one modification of the full rank pseudoinverse matrix, Erokhin V. I, Computational mathematics and mathematical physics, 1999, vol. 39, no 8, pp. 1228-1230.
  • The classic Gene H. Golub and Charles F. Van Loan. Matrix Computations, Johns Hopkins, third edition, 1996, should be also checked. There seem to be some references in filtering and machine learning literature, of course, but hard to say what is in there without getting the fulltext. I'll ask Gene and some other people. Jmath666 01:26, 18 September 2007 (UTC)
  • One more: Ortigueira, M. D., A new algorithm for the factorization and inversion of recursively generated matrices Signal processing 52 (1): 65-73 1996.
I am sorry I cannot find the citations in the Ikramov paper, I do not have electronic access to the fulltext and the references list is not in any of the databases I checked. Jmath666 01:34, 18 September 2007 (UTC)

Got it. The database can be searched at MR Lookup, which generates links like the one I inserted in the reference you gave above. I searched for "Author: Ikramov; Title: Rank-one modifications" and got two results, by the way, the other being:

  • Ikramov, Kh. D.; Matin far, M.: Updating normal pseudosolutions under rank-one modifications of a matrix. (Russian) Zh. Vychisl. Mat. Mat. Fiz. 43 (2003), no. 4, 493--505; translation in Comput. Math. Math. Phys. 43 (2003), no. 4, 469--481. (MR1993678)

--RainerBlome 09:11, 19 September 2007 (UTC)

Interestingly, for "MR1993678" above, the Computational Mathematics and Mathematical Physics journal's own pages on that issue give a different title: Updating the Minimum-Norm Least-Squares Solution under Rank-One Modifications of a Matrix. Unfortunately the abstract does not indicate what their solution is. --RainerBlome 09:33, 19 September 2007 (UTC)

Translations of the title might differ. Gene says that the person who did the most on this is Carl Meyer of NC State but he hasn't seen any recent results:
  • The Moore-Penrose inverse of a bordered matrix, Linear Algebra and Its Appl., 5(1972), pp. 375-382. [2]
  • Generalized inversion of modified matrices, SIAM J. Appl. Math., 24(1973), pp. 315-323. [3]
  • Generalized inverses and ranks of block matrices, SIAM J. Appl. Math., 25(1973), pp.597-602. [4]
[I have added the links]. I have also noted on Mathscinet that the LAA paper cites
  • Greville, T. N. E., Some applications of the pseudoinverse of a matrix. SIAM Rev. 2 1960 15--22 [5]
which has a recursive algorithm per its Mathscinet review. From the reviews, it would seem that the algorithms should be more general than just applying the Sherman-Morrison formula to the inverse inside the expression in the full rank case.
Jmath666 01:46, 26 September 2007 (UTC)
I got the Ikramov paper. The review is wrong; there is no reference in the paper to a "well-known generalization of the Sherman-Morrison formula to the pseudoinverse". The paper deals with a Maple implementation of the rank-one update formula (which was developed in Meyer's papers cited above) and refers for it to theorem 3.1.3 in (the original Pitman 1979 edition of) Generalized Inverses of Linear Transformations by S. L. Campbell and C. D. Meyer ISBN 978-0486666938. Jmath666 02:43, 29 September 2007 (UTC)

Umm, yes, I found that out too, sorry for not having written about that. I wrote an email to Khakim Ikramov about his paper and he replied along the lines that you gave, pointing me to the Campbell-Meyer book, which I promptly ordered (used). The book arrived a few days ago, and yes, there is a section about updated matrices. The book is presented as an introductory text on a field that is mature enough not to require explicit references for each section, which they did not give. It looks like you may have found them. From what I gathered so far, the algorithm they give is indeed more general than the full rank case, but still imposes restrictions on matrix A. --RainerBlome 01:02, 30 September 2007 (UTC)

They all get messy when A is not full rank. It is the nature of the beast. Lots of special cases. One more reference, from Binomial inverse theorem:
  • Kurt S. Riedel, A Sherman–Morrison–Woodbury Identity for Rank Augmenting Matrices with Application to Centering, SIAM Journal on Matrix Analysis and Applications, 13 (1992), 659-662, DOI 10.1137/0613040 preprint
Jmath666 07:34, 30 September 2007 (UTC)

Incremental algorithms are useful where you have a large sample -- say 1,000,000 samples for 50 coefficients (i.e. you'd need to calculate a matrix of 50 x 1,000,000 or so). For example, you have a lot of sample rows a_i (of length 50) and desired results b_i, and you are looking for coefficients x (of length 50) so that Ax ~ b. The simplest thing you can do is to start with A^T A (50x50) = 0 and A^T b (50) = 0 and incrementally maintain A^T A and A^T b by adding the dyad a_i^T a_i to A^T A and a_i^T b_i to A^T b for each sample. After the 1,000,000 samples, you can invert A^T A, and your least-squares x is (A^T A)^{-1} A^T b. Gligeti 13:12, 8 October 2007 (UTC)
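A minimal NumPy sketch of that streaming scheme (variable names and sizes are made up for illustration; note that the normal equations can be ill-conditioned, so SVD- or QR-based solvers are usually preferred when A fits in memory):

import numpy as np

rng = np.random.default_rng(6)
n = 5                                  # number of coefficients (50 in the example above)
AtA = np.zeros((n, n))
Atb = np.zeros(n)

# Stream the samples one at a time; the big matrix A is never stored.
for _ in range(10000):
    a_i = rng.standard_normal(n)       # one sample row
    b_i = rng.standard_normal()        # its desired output
    AtA += np.outer(a_i, a_i)          # accumulate the dyad a_i a_i^T
    Atb += a_i * b_i

x = np.linalg.solve(AtA, Atb)          # least-squares coefficients, assuming AtA is invertible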


Tikhonov

I think this is a misinterpretation of Tikhonov. I don't think the above would converge to anything if A is a singular square matrix (n x n with rank k < n). Gligeti 13:12, 8 October 2007 (UTC)

Try to decompose A by SVD then take the limit... or give a counterexample, please. Jmath666 20:04, 8 October 2007 (UTC)

Sorry: what confused me is that (A*A + δI)^{-1} is clearly not convergent, but its product with A* is... My bad. Gligeti 09:34, 12 October 2007 (UTC)
Could you add that to the proofs sub page? Billlion 09:13, 16 November 2007 (UTC)

homegrown implementations

The sentence "Generally, homegrown implementations are to be avoided, in particular of SVD, if at all possible." could use additional explanation or a reference. As it stands now it reads like an opinion. I imagine that the reason has to do with numerical rounding issues and other considerations that mere mortal programmers will tend to get wrong --- the article should say so explicitly.

MusicScience (talk) 20:18, 2 February 2008 (UTC)

That's what we tell students. The reason is not only rounding, but this would be a major programming project, and high-quality implementations of SVD exist such as in Lapack. Jmath666 (talk) 01:35, 3 February 2008 (UTC)
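For the article's intended audience, the practical upshot might be illustrated like this (a NumPy sketch of my own; the same applies to pinv in Matlab or Octave): call the library routine rather than coding an SVD, and prefer the least-squares solver when the pseudoinverse itself is not actually needed.

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
b = np.array([1.0, 0.0, 1.0])

A_pinv = np.linalg.pinv(A)                                     # SVD-based, library-provided
x, residuals, rank, sing = np.linalg.lstsq(A, b, rcond=None)   # least-squares solver
print(np.allclose(x, A_pinv @ b))                              # True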

Scalars and vectors

Does anybody know of any applications of this Moore-Penrose vector inverse in traditional matrix algebra? As stated, it's an interesting mathematical curiosity that one can define a vector inverse, but what would you do with it?

The subject of geometric algebra (a Clifford algebra normally applied to real Euclidean normed vector spaces) also has a vector inverse, and it has a fairly important place in that mathematics. I noticed that the Moore-Penrose "vector inverse" has a very similar form to this GA vector inverse, and had initially added the following note:

Note also that this is consistent with the  Geometric product vector inverse.

to this section, since I thought it would put this "vector" inverse in a context where it would be more useful. However, I've removed it since I think this comment (and the section in general) needs some context to make it useful. It's also not exactly true, since the Moore-Penrose vector inverse implies that its use is strictly in a dot product sense, which isn't true for the GA vector inverse.

The geometric product vector inverse has the following form:

v^{-1} = v / v^2 = v / |v|^2

(note above that the geometric product of a Euclidean vector with itself is its squared length.)

One place where there is "use" of the Moore-Penrose vector inverse that I can see is in projection onto a line. For example, consider the following (GA) factorization of a vector into its projective and perpendicular (rejective) components:

This first part is the projection of a onto v, and in matrix notation one would write:


Unlike the GA vector inverse, whose associativity allowed for the projective derivation above, this MP vector inverse has only left action, so in the above, you can't further write:

(i.e.: v v^+ is a projection matrix, not scalar unity).

Peeter.joot (talk) 14:50, 13 April 2008 (UTC)

PseudoInverse of Partitioned Tuples

Is this known to be a trivial solution, and/or is the scaling of the SVD algorithm NP?

(* complete dictionary of words, one per row, number of rows is (alpha^word), using words of length "word" and an alphabet of "alpha" number of characters *)
(* "PseudoInverse" is the canonical method of taking the pseudoinverse as per this article and the Mathematica function implemented hereinbelow *)

alpha=4;word=3;dict=.;

dict=Partition[Flatten[Tuples[Reverse[IdentityMatrix[alpha]],word]],(alpha*word)]

PseudoInverse[dict]==((Transpose[dict])*((alpha)^-(word -1)))-((word - 1)/(alpha^word)/word)

True

(* There is nothing special for alpha=4;word=3 - The method works for all cases where word < alpha, but that is a more verbose loop, and the Mathematica PseudoInverse function takes too long to calculate for values of alpha and word that are large, whereas the transpose side of the equation is fast. *)

Doug Youvan (talk) 18:25, 19 April 2008 (UTC)

Extra property?

From

,
,

we can show

,
.

Unlike the case of two different matrices, these relations do not require any conditions (on the row/column orthonormality of one matrix or on the ranks of both); they hold for general 'A'. Does anyone else think this is worth adding explicitly? Ged.R (talk) 17:33, 13 May 2008 (UTC)

Well, I don't think we have to list every known property. Billlion (talk) 07:38, 14 August 2008 (UTC)

redirect?

Pseudoinverse redirects to Generalized inverse, but it seems to me that it would be much better if it redirected to Moore-Penrose pseudoinverse, since this is the main use of the word, at least in my environment. Also, Talk:Pseudoinverse redirects to Talk:Moore-Penrose pseudoinverse. At the very least, the two should redirect to the same article/talk pair. --Zvika (talk) 06:47, 14 August 2008 (UTC)

That is the name given by Penrose (Roger Penrose (1955), "A generalized inverse for matrices"), and the main reference on the topic as a whole is "Ben-Israel, Adi; Thomas N.E. Greville (2003). Generalized Inverses", so while fashions vary between disciplines, I vote we go with the names given to it by the people who first described, popularized and analyzed the thing, rather than subsequent users. Billlion (talk) 07:37, 14 August 2008 (UTC)
I think you misunderstood. I am not proposing a move, I only think that pseudoinverse should redirect to Moore-Penrose pseudoinverse rather than to generalized inverse. --Zvika (talk) 10:05, 14 August 2008 (UTC)
Since there seem to be no further objections, I have changed the redirect on Pseudoinverse. --Zvika (talk) 17:55, 18 August 2008 (UTC)
Ummm, I object. This page wasn't on my watch list.
There's a hat note at the top of Generalized inverse that directs the reader to Moore-Penrose pseudoinverse if that's what they're looking for. Isn't that good enough? Lunch (talk) 19:12, 20 August 2008 (UTC)

lapack

You just say "High quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK." in the article, but do not explain which routines to use. So, could anyone explain which routines in LAPACK I have to use to calculate the Moore-Penrose pseudoinverse of a matrix? --194.81.223.66 (talk) 08:17, 18 August 2008 (UTC)

In fact xGELSS and xGELSD compute the action of the Moore-Penrose pseudoinverse, that is, the solution of the least squares problem. Should one desire the pseudoinverse as an actual matrix (rarely needed) or program the action directly from SVD (xGESVD or xGESDD) this involves a bit of programming along the lines described in the article, see also the function pinv in Matlab. Also, a quick and dirty way is to compute the columns of the pseudoinverse as its action on the columns of identity matrix. For QR, xGEQRF to get the decomposition, xGEQRS to use it to solve least squares. Jmath666 (talk) 05:50, 12 September 2008 (UTC)
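The "quick and dirty" route mentioned above can be spelled out in a few lines of NumPy (whose lstsq wraps LAPACK's xGELSD); this is my own sketch, not an endorsement of forming the full pseudoinverse when only its action is needed:

import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))
m = A.shape[0]

# The columns of A+ are the minimum-norm least-squares solutions for the
# columns of the m x m identity matrix.
A_pinv, *_ = np.linalg.lstsq(A, np.eye(m), rcond=None)
print(np.allclose(A_pinv, np.linalg.pinv(A)))        # True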

Finite number of steps

Also the sentence stating that the SVD cannot be computed in a finite number of steps must be wrong. In the SVD article it is stated that the SVD can be computed from the eigenvalue decomposition of A*A, which certainly can be computed in a finite number of steps. --145.116.239.129 (talk) 19:42, 10 September 2008 (UTC)

What makes you think that eigenvalue decompositions can be computed in a finite number of steps? -- Jitse Niesen (talk) 05:10, 11 September 2008 (UTC)
Wouldn't you say, though, that EVD can be calculated to any desired precision in a finite number of steps? --Zvika (talk) 06:23, 11 September 2008 (UTC)
Yes, I agree with that. -- Jitse Niesen (talk) 08:43, 11 September 2008 (UTC)
To be honest, then, I don't really see the point of that sentence. Any numerical algorithm is only as accurate as the rounding precision. I vote for just removing that sentence entirely. --Zvika (talk) 12:28, 11 September 2008 (UTC)
It does seem out of place here. I don't think this article should talk in detail about the cost of computing the SVD, so I removed the sentence. -- Jitse Niesen (talk) 03:57, 12 September 2008 (UTC)
There is a huge difference between algorithms like QR and Cholesky, which are direct methods and have clear operation counts, and SVD and eigenvalues, which are by necessity iterative. The fact that SVD is competitive at all is close to magic - convergence has to be fast and stopping criteria have to be just right no matter what the data. Of course, if the user just wants to call a subroutine without understanding or curiosity about what is going on, he won't care. Jmath666 (talk) 05:36, 12 September 2008 (UTC)
OK, that sentence was to preempt possible objections that while QR has a clear finite operation count, who knows how long SVD would take since it is iterative, so QR should be preferred (which is not always correct). That appears to be too fine a point, so I agree there is no need to bring this up here. Jmath666 (talk) 19:12, 21 September 2008 (UTC)

Iterative method

I rewrote the recently added text on the iterative method. The old version read:

For regular chosen such that and , as well as for the choices or ( and ) the hyper-power sequence tends to the Moore–Penrose inverse, . For convergence notice that .

I added some references and removed some parts which were not mentioned in the references: the alternative starting values, the name "hyper-power sequence" and the last sentence, which seems to show only that . This is in reply to concerns voiced by Zvika which I share. I'd hugely appreciate supporting references. -- Jitse Niesen (talk) 16:03, 22 April 2009 (UTC)
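For readers of this thread, a minimal NumPy sketch of such an iteration (my own illustration of a hyper-power/Newton-Schulz-type scheme with the standard starting value alpha A*, not copied from the cited references):

import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 3))

# Starting value alpha * A*, with 0 < alpha < 2 / sigma_max(A)^2.
sigma_max = np.linalg.norm(A, 2)
X = A.conj().T / sigma_max**2

for _ in range(60):
    X = 2 * X - X @ A @ X              # second-order iteration; converges to A+

print(np.linalg.norm(X - np.linalg.pinv(A)))   # close to zero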

Definition

This may be the definition of a general pseudo-inverse, but matrices A^+ satisfying these properties are not unique. Among such generalized inverses, I believe the Moore-Penrose inverse is the unique one minimizing the Euclidean norm of A-preimages.

64.79.135.151 (talk) 19:24, 12 November 2009 (UTC)Ben Davis

I think you're wrong. This is the standard definition and it uniquely defines the Moore-Penrose pseudoinverse. Compare generalized inverse. --Zvika (talk) 09:45, 13 November 2009 (UTC)

I stand corrected -- My apologies; I overlooked the last of the four properties characterizing the Moore-Penrose pseudoinverse. —Preceding unsigned comment added by 64.79.135.151 (talk) 20:20, 23 December 2009 (UTC)

Clear statement of the Pseudo Inverse

Am I missing something?

The statement of the Pseudoinverse seems buried pages down this article.

Would it not help those seeking a quick resource to remind them of the equation for the pseudoinverse (I suspect this is probably the most common use of this page) to have it stated clearly and boldly at the head of the article, obviating the need to wade through reams of text to uncover the answer?

82.8.176.208 (talk) 14:21, 13 October 2010 (UTC) 82.8.176.208 (talk) 14:22, 13 October 2010 (UTC)

Cleanup

Added an initial notation section. Streamlined the discussion to highlight each class of properties (existence, uniqueness, orthogonal projections, subspaces, etc.). Suggest the section on identities be removed - these are all consequences of four simple facts such as PA = A = AQ and A^+P = A^+ = QA^+, which are easily proved. To see their use, see Least-Squares and Minimum-Norm Solution to Linear Systems on the proof page. Abelian (talk) 08:18, 27 October 2010 (UTC)

incorrect product formula

It was said in the "product identities" section that if A has full column rank and B has full row rank, then

(AB)^+ = B^+ * A^+

This is not true; it can easily be verified that if B = [1; 1] and A = [0 3; 0 4] then A has full column rank and B has full row rank, but the two sides are not equal. —Preceding unsigned comment added by 68.40.86.209 (talk) 21:44, 31 January 2011 (UTC)

Maybe I'm missing something here — If A = [0 3; 0 4], then its determinant is zero, so it doesn't have full column rank. Bad example? Duoduoduo (talk) 22:02, 31 January 2011 (UTC)
No, I wasn't missing anything -- the property was correct as stated, so I've reverted the edit that deleted it. With B having full row rank and A having full column rank, we have , which does in fact equal since it satisfies . Duoduoduo (talk) 23:26, 31 January 2011 (UTC)
Sorry, the example was supposed to be A = [3 0; 0 4], B = [1; 1], so that AB = [3; 4]. is invertible and hence of full column and row rank; is full row rank (its row space is dimension 1), so the property conditions are met. Using the formula , and and can all be computed, since all have linearly independent columns. But while and , so . But your computations appear to be correct, so I'm confused about what went wrong here... 68.40.86.209 (talk) 02:06, 1 February 2011 (UTC)
Okay, I see the source of our confusion. It appears that you are taking "full row rank" to mean "linearly independent rows". The definition of "full rank" that I am familiar with is "rank equal to min(m,n), where the matrix has dimensions m x n". If you go to rank (linear algebra), you'll see this is the definition used there. Given that we have some confusion about the definition, perhaps the property you wish to include should say "linearly independent rows" rather than "full row rank". 68.40.86.209 (talk) 02:31, 1 February 2011 (UTC)
Nevermind, it appears that your usage of full rank is the accepted one. I was entirely off-base. Carry on :) 68.40.86.209 (talk) 03:45, 1 February 2011 (UTC)
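In case it helps later readers of this thread, a quick NumPy check of the property as stated (my own illustration): with A having linearly independent columns and B linearly independent rows, the product rule holds.

import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((5, 3))        # linearly independent columns (A+ A = I)
B = rng.standard_normal((3, 4))        # linearly independent rows   (B B+ = I)

lhs = np.linalg.pinv(A @ B)
rhs = np.linalg.pinv(B) @ np.linalg.pinv(A)
print(np.allclose(lhs, rhs))           # True under these rank assumptions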

An application

For an m x n matrix A with m < n whose rank is m, what is A^+? Do we have a formula for this? Do we have ??

Jackzhp (talk) 02:35, 7 February 2011 (UTC)

The formula is A^+ = A^*(AA^*)^{-1}. See Moore–Penrose pseudoinverse#Linearly independent rows. For the real case this can be written A^+ = A^T(AA^T)^{-1}. Duoduoduo (talk) 15:52, 7 February 2011 (UTC)

Rings

The Notation section states: "The ring of matrices over is denoted by ." But unless I am much mistaken, matrices do not form a ring unless m=n, as they can't be multiplied together. Should I change "ring" to "set"? Akwdb (talk) 17:01, 8 August 2011 (UTC)

Yes, I agree. Multiplication of non-square matrices of the same dimensions is not even defined. It is merely a group on . Akshay Wed Aug 17 12:49:22 UTC 2011. —Preceding undated comment added 12:50, 17 August 2011 (UTC).

Consistent Pseudoinverse Notation

In several places, is used to indicate the pseudoinverse of A, while in other places is used. It seems that the notation is more commonly seen in the internet / elsewhere. In any case, it would be good to choose one notation, and use it consistently throughout the article.

Golub and Van Loan use while engineering texts seem to use . In any case, the partial transition to was carried out incompletely and without justification, so I am restoring the original version which used . I agree that either one is fine and the important thing is consistency. --Zvika (talk) 06:24, 29 September 2009 (UTC)
Actually, I think the + notation has only arisen because the dagger symbol was often not available. I would prefer to stick to dagger; unfortunately Gene Golub is no longer around to ask his opinion. Billlion (talk) 20:50, 12 February 2012 (UTC)

Unclear notation

It seems that the asterisk A* actually stands for transposition plus complex conjugation. In many fields (particularly in physics) this notation is reserved for the complex conjugate alone, while transposition plus complex conjugation is denoted by the "dagger" A^†. Since this is very confusing, I think the notation should be either changed or explained at the beginning of the article. Dan Gluck (talk) 13:33, 2 February 2010 (UTC)

The explanation you want is right there, since 2003: [6]. Using the dagger would cause even more confusion, since some texts use it to denote the Pseudoinverse. --RainerBlome (talk) 09:27, 2 September 2010 (UTC)
In the section on Geometric Interpretations, a "circle-plus" and an "upside-down-T" or "bottom" notation are introduced without definition. It would be great to get either a brief explanation or a reference to another page that defines this notation. Gravy would be a reference to background and a more lengthy discussion of this section. — Preceding unsigned comment added by Rebcabin (talk • contribs) 13:50, 24 April 2015 (UTC)
I've linked to the articles defining the notation. —Quondum 15:29, 24 April 2015 (UTC)

Please make only one change per edit

Understanding differences is much easier when every conceptual change is in its own edit.

For example, if you want to join two paragraphs, do it without changing their content (or anything else). And please write "joined paras" or something like that in the summary. :-) To change both content and markup, use one edit to change the markup, and a separate, second edit to change the content. --RainerBlome (talk) 09:31, 8 October 2015 (UTC)

Prefer to keep multi-line paras

Preferably, do not join lines that are already part of the same paragraph. MediaWiki markup allows us to use multiple lines to make up a paragraph. This is good. It allows us to structure the source text, making it easier to understand and maintain. --RainerBlome (talk) 09:31, 8 October 2015 (UTC)

New article name

I had an administrator change the name of this article. I updated the name within the article and cited its sources, because it was not correct. The correct name is Moore-Penrose inverse, as in the sources.—Anita5192 (talk) 04:30, 11 October 2017 (UTC)