We consider the problem of learning loss-augmented conditional random fields, which subsume both structural SVMs based on the corresponding graph and more classical CRFs, using the usual convex version of the Bethe relaxation over the local polytope. We propose a general dual augmented Lagrangian formulation which recovers, as a special case, the form of smoothing proposed by Meshi et al. (2015a) in the context of MAP inference. First, we implement a proximal stochastic dual coordinate ascent algorithm for this formulation, make explicit the convergence rate guarantees implied by the results of Shalev-Shwartz and Zhang (2016), and show that its empirical performance compares favorably with block coordinate Frank-Wolfe for maximum likelihood learning. Then, using the block coordinate method of multipliers algorithm of Hong et al. (2014), we show that the formulation in which the local polytope is not relaxed can be solved almost as efficiently, and illustrate empirically that this can yield improved performance.