Your first idea is an assignment formulation. For $i = 1, \dots, N$ and $j = 1, \dots, S$ define $x_{ij}$ by:
$x_{ij} = \begin{cases} 1 & \text{if document } i \text{ is assigned to envelope type } j\\ 0 & \text{otherwise} \end{cases}$
Furthermore, we define the decision variable $a_j$ as the area of envelope type $j$, for $j = 1, \dots, S$. This results in the
following model:
$\text{min } \sum_{i=1}^{N} \sum_{j=1}^{S} a_j x_{ij}$
$\text{s.t. } \sum_{j=1}^{S} x_{ij} = 1 \qquad i = 1, \dots, N \qquad \text{(assign)}$
$\text{if } x_{ij} = 1 \text{ then } a_j \ge s^2_i \qquad i = 1, \dots, N; j = 1, \dots, S \qquad \text{(fit)}$
$x_{ij} \in \{0, 1\} \qquad i = 1, \dots, N; j = 1, \dots, S$
$a_j \ge 0 \qquad j = 1, \dots, S$
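This mixed integer programming (MIP) model has two challenges: 1) the objective function is not linear,
because it contains the product $a_j x_{ij}$ of two decision variables, and 2) the (fit) constraint is a
logical implication that is not yet written as an algebraic constraint.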
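To get a feel for what the model computes before linearizing it, the following is a minimal brute-force sketch for a tiny instance. It assumes, as the term $s^2_i$ in (fit) suggests, that document $i$ is a square with side $s_i$. The function name `brute_force_envelopes` and the sample data are hypothetical and only serve as illustration. For a fixed assignment, the cheapest area satisfying (fit) for type $j$ is the largest $s^2_i$ among the documents assigned to it, so the code enumerates all assignments and evaluates $\sum_i \sum_j a_j x_{ij}$ directly.

```python
from itertools import product


def brute_force_envelopes(sides, num_types):
    """Enumerate all assignments of documents to envelope types.

    sides[i] is the side length s_i of (square) document i; num_types is S.
    Returns (best_cost, best_assignment), where best_assignment[i] is the
    0-based envelope type chosen for document i.
    """
    best_cost, best_assign = float("inf"), None
    for assign in product(range(num_types), repeat=len(sides)):
        # (fit): the smallest feasible a_j is the largest s_i^2 assigned to type j
        area = [0] * num_types
        for i, j in enumerate(assign):
            area[j] = max(area[j], sides[i] ** 2)
        # Objective: each document pays the area of the envelope type it uses
        cost = sum(area[j] for j in assign)
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Three documents with sides 1, 2, 3 and S = 2 envelope types
print(brute_force_envelopes([1, 2, 3], 2))  # (17, (0, 0, 1))
```

The enumeration visits $S^N$ assignments, so it is only usable for very small $N$. It is meant as a reference for checking a linear reformulation, not as a solution method.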
Question 1
Introduce the variable $y_{ij}$ defined by $y_{ij} = a_j x_{ij}$ and use this to establish a model with a linear
objective and linear constraints.