I'm currently studying stochastic processes for the first time in the context of physics (Langevin dynamics), and I've come across a few conceptual difficulties regarding the Fokker-Planck equation which I want to clear up. The general form I'm looking at is:
$\frac{\partial p}{\partial t} = - \frac{\partial }{\partial x} \left( \mu(x,t) \ p \right) + \frac{\partial^2 }{\partial x^2} \left( D(x,t) \ p \right).$
The boundary condition is absorption at infinity: the probability current $j(x,t) \equiv \mu (x,t) p - \frac{\partial }{\partial x} \left( D(x,t) \ p \right) \to 0, \quad |x| \to \infty \quad \forall t,$
although this isn't really relevant to my questions.
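For reference, with $j$ as defined above, the equation I wrote down is just a continuity equation for the probability (this is my own rewriting of the two formulas above, nothing beyond them is assumed):
$\frac{\partial p}{\partial t} + \frac{\partial j}{\partial x} = 0,$
so integrating over all $x$ and using $j \to 0$ as $|x| \to \infty$ gives
$\frac{d}{dt} \int_{-\infty}^{\infty} dx \ p = - \Big[ j \Big]_{x \to -\infty}^{x \to +\infty} = 0,$
i.e. the total probability stays constant in time.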
First of all, what is the function $p$ in the equation? In the notes I'm using, it's the conditional pdf: $p = p(x,t|x_0,t_0)$ (this is actually how the equation is derived). Elsewhere though (e.g. Wikipedia), it stands for the regular pdf: $p = p(x,t)$.
Question 1. Which is it? I am tempted to say it doesn't matter, because by the total probability rule,
$ p(x,t) = \int dx' dt'\ p(x,t|x',t')p(x',t'),$
so I can multiply the PDE through by $p(x',t')$ and integrate over $x'$ and $t'$, transforming the conditional pdf into the regular pdf. Conversely, if the PDE is for the regular pdf, then I can pull $p(x',t')$ and the integrals out in front and argue that the equation must hold for any $p(x',t')$, thus transforming the regular pdf into the conditional pdf. Is this argument correct?
Of course what will change are the initial conditions, which brings me to my next point.
My notes only consider processes that are time-translation invariant, so $\mu$ and $D$ have no time dependence, and $p(x,t|x_0,t_0) = p(x,t-t_0|x_0,0) \equiv p(x,t-t_0|x_0)$. Then the initial condition (for the conditional pdf) is:
$p(x,0|x_0) = \delta(x-x_0).$
After solving this, I know $p(x,t|x_0)$, so by the total probability rule I can find $p(x,t)$ by integrating over all values of $x_0$, provided I know their initial distribution $g(x_0)$:
$p(x,t) = \int dx_0 \ p(x,t|x_0) g(x_0).$
On the other hand, if the PDE is for the regular pdf, the initial condition is:
$p(x,0) = g(x).$
Solving this PDE of course gives $p(x,t)$ directly.
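As a sanity check that the two routes above agree, here is a minimal numerical sketch for the simplest case I can think of. Everything specific in it is my own assumption rather than something from my notes: $\mu = 0$, a constant $D = 0.5$, a Gaussian $g$ with unit width, the grid, and a plain explicit finite-difference scheme. For this case the conditional pdf is the free-diffusion Gaussian, and the exact answer is a Gaussian with variance $s_0^2 + 2Dt$.

```python
import numpy as np

# Assumed toy setup: mu = 0, constant D, Gaussian initial distribution g.
D = 0.5          # constant diffusion coefficient (assumption)
s0 = 1.0         # standard deviation of the initial distribution g (assumption)
t_final = 1.0    # time at which the two routes are compared

x = np.linspace(-12.0, 12.0, 481)   # grid wide enough that p ~ 0 at the edges
dx = x[1] - x[0]

def g(x0):
    """Assumed initial distribution: normalized Gaussian of width s0."""
    return np.exp(-x0**2 / (2 * s0**2)) / np.sqrt(2 * np.pi * s0**2)

def propagator(x, t, x0):
    """Solution of dp/dt = D d2p/dx2 with p(x,0|x0) = delta(x - x0)."""
    return np.exp(-(x - x0)**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

# Route 1: conditional pdf integrated against g (total probability rule).
K = propagator(x[:, None], t_final, x[None, :])   # K[i, j] = p(x_i, t | x0_j)
p_from_propagator = K @ g(x) * dx                 # Riemann sum over x0

# Route 2: solve the PDE directly with p(x,0) = g(x), explicit Euler in time.
dt = 0.2 * dx**2 / D                  # stability requires D*dt/dx^2 <= 1/2
n_steps = int(round(t_final / dt))
dt = t_final / n_steps                # land exactly on t_final
p = g(x)
for _ in range(n_steps):
    lap = np.zeros_like(p)
    lap[1:-1] = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx**2   # second derivative
    p = p + dt * D * lap              # boundary values stay at their (tiny) initial value
p_direct = p

# Exact result for this toy case: Gaussian with variance s0^2 + 2*D*t.
var_t = s0**2 + 2 * D * t_final
p_exact = np.exp(-x**2 / (2 * var_t)) / np.sqrt(2 * np.pi * var_t)

err_propagator = np.max(np.abs(p_from_propagator - p_exact))
err_direct = np.max(np.abs(p_direct - p_exact))
print("max error, propagator route:", err_propagator)
print("max error, direct PDE route:", err_direct)
```

Both routes should reproduce the same Gaussian up to discretization error, with the direct route limited by the accuracy of the finite-difference scheme.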
Question 2. How do I extend this to non-stationary processes? For the second case (regular pdf) everything stays the same, I think. But in the first case (conditional pdf), which initial condition do I want?
$p(x,0|x_0,t_0) = \ ?$
Also, how do I retrieve $p(x,t)$ after solving for $p(x,t|x_0,t_0)$? I have the initial distribution $g(x_0)$, but I can't use the total probability theorem as before. It seems like there's one time variable missing in all this. Should my new $g$ be a function of two variables?
Question 3. The conditional pdf for the stationary process $p(x,t|x_0)$ appears to have the interpretation of a Green's function (integrating it with the initial condition $g(x)$ yields the sought total pdf). But the Fokker-Planck equation isn't of the form $L p(x,t) = g(x),$ with some linear differential operator $L$. In fact, the Fokker-Planck equation is homogeneous. So, for which operator (PDE), if any, is $p(x,t|x_0)$ the Green's function? I think maybe I understand Green's functions wrong...
EDIT: I put up a bounty because I'm looking for an answer which would specifically address the three questions posed in detail.