While implementing a Hamiltonian neural network, I ran into a question that I could not answer myself.

The network takes x = (q, p) as input and outputs the Hamiltonian H.
Let H = f(x), where f is a neural network with parameters θ.

Loss function
L = (∂H/∂p - dq/dt)^2 + (∂H/∂q + dp/dt)^2
(each term is the residual of one of Hamilton's equations, dq/dt = ∂H/∂p and dp/dt = -∂H/∂q)

With the loss defined this way, I would like to ask how to calculate its gradient ∂L/∂θ with respect to θ.
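To make the setup concrete, here is a minimal sketch in JAX. It assumes a tiny MLP for f and a loss that pairs the residuals according to Hamilton's equations (dq/dt = ∂H/∂p, dp/dt = -∂H/∂q). The names `init_params`, `hamiltonian`, and `loss_fn`, the layer sizes, and the sample data are all hypothetical and only for illustration.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(2, 16, 1)):
    # Hypothetical tiny MLP: 2 inputs (q, p) -> 16 hidden units -> scalar H
    keys = jax.random.split(key, len(sizes) - 1)
    return [
        (jax.random.normal(k, (n_in, n_out)) / jnp.sqrt(n_in), jnp.zeros(n_out))
        for k, n_in, n_out in zip(keys, sizes[:-1], sizes[1:])
    ]

def hamiltonian(params, x):
    # H = f_theta(x) with x = (q, p); returns a scalar
    h = x
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b)[0]

def loss_fn(params, x, dxdt):
    # Gradient of H with respect to the input x = (q, p).
    # The result is still a function of params (theta).
    dH_dx = jax.grad(hamiltonian, argnums=1)(params, x)
    dH_dq, dH_dp = dH_dx[0], dH_dx[1]
    dq_dt, dp_dt = dxdt[0], dxdt[1]
    return (dH_dp - dq_dt) ** 2 + (dH_dq + dp_dt) ** 2

params = init_params(jax.random.PRNGKey(0))
x = jnp.array([1.0, 0.5])       # sample (q, p)
dxdt = jnp.array([0.5, -1.0])   # sample (dq/dt, dp/dt), e.g. for H = (q^2 + p^2) / 2

# Gradient of the loss with respect to theta (argnums=0 is the default):
# this differentiates through the input-gradient computed inside loss_fn.
grads = jax.grad(loss_fn)(params, x, dxdt)
```

In this sketch `grads` has the same nested structure as `params`, and q and p are treated purely as data throughout.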

∂L/∂θ = (∂L/∂H) (∂H/∂p) (∂p/∂θ)

This is what I want to calculate, but since neither p nor q is a function of θ, I do not think ∂p/∂θ can be calculated.
How is the gradient of the loss function with respect to θ actually calculated?
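One possible way to look at it, offered as a sketch rather than a definitive answer: q and p are fixed inputs, so ∂p/∂θ = 0 and never needs to be computed. The dependence on θ enters because ∂H/∂p and ∂H/∂q are derivatives of f, and so are themselves functions of θ. Writing f_θ for the network and assuming the loss pairs the residuals according to Hamilton's equations (dq/dt = ∂H/∂p, dp/dt = -∂H/∂q), the chain rule would then read:

```latex
\frac{\partial L}{\partial \theta}
  = 2\left(\frac{\partial H}{\partial p} - \frac{dq}{dt}\right)
      \frac{\partial}{\partial \theta}\!\left(\frac{\partial f_\theta}{\partial p}\right)
  + 2\left(\frac{\partial H}{\partial q} + \frac{dp}{dt}\right)
      \frac{\partial}{\partial \theta}\!\left(\frac{\partial f_\theta}{\partial q}\right)
```

Here dq/dt and dp/dt are observed data and do not depend on θ, so only the mixed second derivatives of f_θ contribute.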

This is a mathematical question, so it may be off-topic here, but I could not find a better place to ask.