Are there cases where fontenc + luatex (or xetex) cause problems?
According to the luatex docs, you shouldn't use fontenc
with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec
only when handling the latter two.)
I know fontenc
is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?
fonts xetex luatex
add a comment |
According to the luatex docs, you shouldn't use fontenc
with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec
only when handling the latter two.)
I know fontenc
is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?
fonts xetex luatex
2
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54
add a comment |
According to the luatex docs, you shouldn't use fontenc
with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec
only when handling the latter two.)
I know fontenc
is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?
fonts xetex luatex
According to the luatex docs, you shouldn't use fontenc
with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec
only when handling the latter two.)
I know fontenc
is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?
fonts xetex luatex
fonts xetex luatex
edited Jan 20 at 9:10
Joseph Wright♦
204k22559886
204k22559886
asked Jan 20 at 3:10
karldwkarldw
884
884
2
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54
add a comment |
2
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54
2
2
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54
add a comment |
3 Answers
3
active
oldest
votes
fontenc
is loaded by fontspec
(you can check this in the log). So in itself the package is not a problem.
But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
end{document}
will load t1enc.def
, lgrenc.def
and t2enc.def
.
This also is not problematic with lualatex and xelatex.
But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU
encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
usepackage{fontspec}
setmainfont{DejaVuSans}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
Grüße, αβγ, Ҍҋ
{fontencoding{T1}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{LGR}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{T2A}selectfont
Grüße, αβγ, Ҍҋ}
end{document}
So you can use fontenc
in your document (I need it to use chessfonts
), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:
documentclass{article}
usepackage{fontspec}
setmainfont{DejaVuSans}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
%
begin{document}% wrong, encoding is T2A
Moving the setmainfont
resolves the problem:
documentclass{article}
usepackage{fontspec}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
setmainfont{DejaVuSans}
begin{document} %encoding is TU now
The solution edited in at the bottom does not work correctly. The last three lines display as:Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.
– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
|
show 6 more comments
An Example that Might Bite You
It can cause problems if you load both fontspec
and fontenc
together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.
documentclass[varwidth, preview]{standalone}
usepackage[spanish]{babel}
% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]{japanese}
% Implicitly causes babel to load fontspec:
babelfont[japanese]{rm}{Noto Sans CJK JP}
% Implicitly loads fontenc with [T1]:
usepackage[poorman]{fourier}
begin{document}
¿Es foreignlanguage{japanese}{日本} Utopía?
end{document}
Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec
renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont
after fourier
, you would instead set the main font to Latin Modern Roman.
The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc
breaks, you can stop reading.
How to Combine Unicode and Legacy Fonts
I’m not judging. Sometimes I don’t get to set the requirements.
To fix this example, load luainputenc
, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:
documentclass[varwidth, preview]{standalone}
usepackage[T1]{fontenc}
usepackage{textcomp}
usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
usepackage[spanish]{babel}
usepackage[no-math]{fontspec}
defaultfontfeatures{ Scale = MatchUppercase }
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
usepackage[poorman]{fourier}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
A Better Solution
A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:
documentclass[varwidth, preview]{standalone}
usepackage{polyglossia}
setdefaultlanguage{spanish}
defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the commandtextquestiondown
, or the slot in a specific encoding. Loadingfontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loadingluainputenc
turns it back on.
– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified asusepackage[latin1]{inputenc}
orusepackage[utf8]{inputenc}
or whatever encoding is in use. So¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)
– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you loadfontenc
but notfontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly loadinputenc
orselinput
. If you load bothfontenc
andfontspec
, these characters will not be activated and some of them will break.
– Davislor
Jan 21 at 1:09
add a comment |
Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecoding{T1}selectfont
is the real problem, although that is most commonly activated by
usepackage[T1][fontenc}
so it is simplest to tell people not to use fontenc
.
In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.
The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
Do you know an example where the hyphenation goes wrong due tofontenc
andT1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730
– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
@moewe well naturally it gets better withgrüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).
– Ulrike Fischer
Feb 5 at 14:30
|
show 4 more comments
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "85"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f470976%2fare-there-cases-where-fontenc-luatex-or-xetex-cause-problems%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
fontenc
is loaded by fontspec
(you can check this in the log). So in itself the package is not a problem.
But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
end{document}
will load t1enc.def
, lgrenc.def
and t2enc.def
.
This also is not problematic with lualatex and xelatex.
But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU
encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
usepackage{fontspec}
setmainfont{DejaVuSans}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
Grüße, αβγ, Ҍҋ
{fontencoding{T1}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{LGR}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{T2A}selectfont
Grüße, αβγ, Ҍҋ}
end{document}
So you can use fontenc
in your document (I need it to use chessfonts
), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:
documentclass{article}
usepackage{fontspec}
setmainfont{DejaVuSans}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
%
begin{document}% wrong, encoding is T2A
Moving the setmainfont
resolves the problem:
documentclass{article}
usepackage{fontspec}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
setmainfont{DejaVuSans}
begin{document} %encoding is TU now
The solution edited in at the bottom does not work correctly. The last three lines display as:Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.
– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
|
show 6 more comments
fontenc
is loaded by fontspec
(you can check this in the log). So in itself the package is not a problem.
But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
end{document}
will load t1enc.def
, lgrenc.def
and t2enc.def
.
This also is not problematic with lualatex and xelatex.
But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU
encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
usepackage{fontspec}
setmainfont{DejaVuSans}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
Grüße, αβγ, Ҍҋ
{fontencoding{T1}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{LGR}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{T2A}selectfont
Grüße, αβγ, Ҍҋ}
end{document}
So you can use fontenc
in your document (I need it to use chessfonts
), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:
documentclass{article}
usepackage{fontspec}
setmainfont{DejaVuSans}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
%
begin{document}% wrong, encoding is T2A
Moving the setmainfont
resolves the problem:
documentclass{article}
usepackage{fontspec}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
setmainfont{DejaVuSans}
begin{document} %encoding is TU now
The solution edited in at the bottom does not work correctly. The last three lines display as:Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.
– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
|
show 6 more comments
fontenc
is loaded by fontspec
(you can check this in the log). So in itself the package is not a problem.
But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
end{document}
will load t1enc.def
, lgrenc.def
and t2enc.def
.
This also is not problematic with lualatex and xelatex.
But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU
encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
usepackage{fontspec}
setmainfont{DejaVuSans}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
Grüße, αβγ, Ҍҋ
{fontencoding{T1}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{LGR}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{T2A}selectfont
Grüße, αβγ, Ҍҋ}
end{document}
So you can use fontenc
in your document (I need it to use chessfonts
), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:
documentclass{article}
usepackage{fontspec}
setmainfont{DejaVuSans}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
%
begin{document}% wrong, encoding is T2A
Moving the setmainfont
resolves the problem:
documentclass{article}
usepackage{fontspec}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
setmainfont{DejaVuSans}
begin{document} %encoding is TU now
fontenc
is loaded by fontspec
(you can check this in the log). So in itself the package is not a problem.
But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
end{document}
will load t1enc.def
, lgrenc.def
and t2enc.def
.
This also is not problematic with lualatex and xelatex.
But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU
encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:
documentclass{article}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
usepackage{fontspec}
setmainfont{DejaVuSans}
begin{document}
encodingdefault, makeatletter f@encodingmakeatother
Grüße, αβγ, Ҍҋ
{fontencoding{T1}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{LGR}selectfont
Grüße, αβγ, Ҍҋ}
{fontencoding{T2A}selectfont
Grüße, αβγ, Ҍҋ}
end{document}
So you can use fontenc
in your document (I need it to use chessfonts
), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:
documentclass{article}
usepackage{fontspec}
setmainfont{DejaVuSans}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
%
begin{document}% wrong, encoding is T2A
Moving the setmainfont
resolves the problem:
documentclass{article}
usepackage{fontspec}
usepackage[T1]{fontenc}
usepackage[LGR]{fontenc}
usepackage[T2A]{fontenc}
setmainfont{DejaVuSans}
begin{document} %encoding is TU now
edited Jan 23 at 9:53
Joseph Wright♦
204k22559886
204k22559886
answered Jan 20 at 8:40
Ulrike FischerUlrike Fischer
191k8298681
191k8298681
The solution edited in at the bottom does not work correctly. The last three lines display as:Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.
– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
|
show 6 more comments
The solution edited in at the bottom does not work correctly. The last three lines display as:Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.
– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
The solution edited in at the bottom does not work correctly. The last three lines display as:
Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.– Davislor
Jan 20 at 12:14
The solution edited in at the bottom does not work correctly. The last three lines display as:
Grüße, ,
ΓρῤΫε, ,
GrьЯe, ,
. This is because non-ASCII characters are rendered inactive.– Davislor
Jan 20 at 12:14
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
I took the liberty of replacing the fix at the end with one that really does work. Mostly.
– Davislor
Jan 20 at 13:38
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.
– Ulrike Fischer
Jan 20 at 13:41
1
1
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.
– Davislor
Jan 20 at 14:01
2
2
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.
– Ulrike Fischer
Jan 20 at 14:10
|
show 6 more comments
An Example that Might Bite You
It can cause problems if you load both fontspec
and fontenc
together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.
documentclass[varwidth, preview]{standalone}
usepackage[spanish]{babel}
% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]{japanese}
% Implicitly causes babel to load fontspec:
babelfont[japanese]{rm}{Noto Sans CJK JP}
% Implicitly loads fontenc with [T1]:
usepackage[poorman]{fourier}
begin{document}
¿Es foreignlanguage{japanese}{日本} Utopía?
end{document}
Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec
renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont
after fourier
, you would instead set the main font to Latin Modern Roman.
The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc
breaks, you can stop reading.
How to Combine Unicode and Legacy Fonts
I’m not judging. Sometimes I don’t get to set the requirements.
To fix this example, load luainputenc
, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:
documentclass[varwidth, preview]{standalone}
usepackage[T1]{fontenc}
usepackage{textcomp}
usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
usepackage[spanish]{babel}
usepackage[no-math]{fontspec}
defaultfontfeatures{ Scale = MatchUppercase }
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
usepackage[poorman]{fourier}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
A Better Solution
A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:
documentclass[varwidth, preview]{standalone}
usepackage{polyglossia}
setdefaultlanguage{spanish}
defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the commandtextquestiondown
, or the slot in a specific encoding. Loadingfontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loadingluainputenc
turns it back on.
– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified asusepackage[latin1]{inputenc}
orusepackage[utf8]{inputenc}
or whatever encoding is in use. So¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)
– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you loadfontenc
but notfontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly loadinputenc
orselinput
. If you load bothfontenc
andfontspec
, these characters will not be activated and some of them will break.
– Davislor
Jan 21 at 1:09
add a comment |
An Example that Might Bite You
It can cause problems if you load both fontspec
and fontenc
together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.
documentclass[varwidth, preview]{standalone}
usepackage[spanish]{babel}
% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]{japanese}
% Implicitly causes babel to load fontspec:
babelfont[japanese]{rm}{Noto Sans CJK JP}
% Implicitly loads fontenc with [T1]:
usepackage[poorman]{fourier}
begin{document}
¿Es foreignlanguage{japanese}{日本} Utopía?
end{document}
Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec
renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont
after fourier
, you would instead set the main font to Latin Modern Roman.
The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc
breaks, you can stop reading.
How to Combine Unicode and Legacy Fonts
I’m not judging. Sometimes I don’t get to set the requirements.
To fix this example, load luainputenc
, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:
documentclass[varwidth, preview]{standalone}
usepackage[T1]{fontenc}
usepackage{textcomp}
usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
usepackage[spanish]{babel}
usepackage[no-math]{fontspec}
defaultfontfeatures{ Scale = MatchUppercase }
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
usepackage[poorman]{fourier}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
A Better Solution
A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:
documentclass[varwidth, preview]{standalone}
usepackage{polyglossia}
setdefaultlanguage{spanish}
defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the commandtextquestiondown
, or the slot in a specific encoding. Loadingfontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loadingluainputenc
turns it back on.
– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified asusepackage[latin1]{inputenc}
orusepackage[utf8]{inputenc}
or whatever encoding is in use. So¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)
– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you loadfontenc
but notfontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly loadinputenc
orselinput
. If you load bothfontenc
andfontspec
, these characters will not be activated and some of them will break.
– Davislor
Jan 21 at 1:09
add a comment |
An Example that Might Bite You
It can cause problems if you load both fontspec
and fontenc
together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.
documentclass[varwidth, preview]{standalone}
usepackage[spanish]{babel}
% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]{japanese}
% Implicitly causes babel to load fontspec:
babelfont[japanese]{rm}{Noto Sans CJK JP}
% Implicitly loads fontenc with [T1]:
usepackage[poorman]{fourier}
begin{document}
¿Es foreignlanguage{japanese}{日本} Utopía?
end{document}
Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec
renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont
after fourier
, you would instead set the main font to Latin Modern Roman.
The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc
breaks, you can stop reading.
How to Combine Unicode and Legacy Fonts
I’m not judging. Sometimes I don’t get to set the requirements.
To fix this example, load luainputenc
, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:
documentclass[varwidth, preview]{standalone}
usepackage[T1]{fontenc}
usepackage{textcomp}
usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
usepackage[spanish]{babel}
usepackage[no-math]{fontspec}
defaultfontfeatures{ Scale = MatchUppercase }
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
usepackage[poorman]{fourier}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
A Better Solution
A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:
documentclass[varwidth, preview]{standalone}
usepackage{polyglossia}
setdefaultlanguage{spanish}
defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
An Example that Might Bite You
It can cause problems if you load both fontspec
and fontenc
together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.
documentclass[varwidth, preview]{standalone}
usepackage[spanish]{babel}
% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]{japanese}
% Implicitly causes babel to load fontspec:
babelfont[japanese]{rm}{Noto Sans CJK JP}
% Implicitly loads fontenc with [T1]:
usepackage[poorman]{fourier}
begin{document}
¿Es foreignlanguage{japanese}{日本} Utopía?
end{document}
Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec
renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont
after fourier
, you would instead set the main font to Latin Modern Roman.
The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc
breaks, you can stop reading.
How to Combine Unicode and Legacy Fonts
I’m not judging. Sometimes I don’t get to set the requirements.
To fix this example, load luainputenc
, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:
documentclass[varwidth, preview]{standalone}
usepackage[T1]{fontenc}
usepackage{textcomp}
usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
usepackage[spanish]{babel}
usepackage[no-math]{fontspec}
defaultfontfeatures{ Scale = MatchUppercase }
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
usepackage[poorman]{fourier}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
A Better Solution
A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:
documentclass[varwidth, preview]{standalone}
usepackage{polyglossia}
setdefaultlanguage{spanish}
defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
newfontfamilyjapanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]
newcommandtextjapanese[1]{{japanesefont #1}}
begin{document}
¿Es textjapanese{日本} Utopía?
end{document}
This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
edited Jan 20 at 22:13
answered Jan 20 at 4:11
DavislorDavislor
6,3011228
6,3011228
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the commandtextquestiondown
, or the slot in a specific encoding. Loadingfontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loadingluainputenc
turns it back on.
– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified asusepackage[latin1]{inputenc}
orusepackage[utf8]{inputenc}
or whatever encoding is in use. So¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)
– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you loadfontenc
but notfontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly loadinputenc
orselinput
. If you load bothfontenc
andfontspec
, these characters will not be activated and some of them will break.
– Davislor
Jan 21 at 1:09
add a comment |
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the commandtextquestiondown
, or the slot in a specific encoding. Loadingfontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loadingluainputenc
turns it back on.
– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified asusepackage[latin1]{inputenc}
orusepackage[utf8]{inputenc}
or whatever encoding is in use. So¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)
– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you loadfontenc
but notfontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly loadinputenc
orselinput
. If you load bothfontenc
andfontspec
, these characters will not be activated and some of them will break.
– Davislor
Jan 21 at 1:09
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)
– David Carlisle
Jan 20 at 22:59
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.
– Davislor
Jan 21 at 0:01
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the command
textquestiondown
, or the slot in a specific encoding. Loading fontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc
turns it back on.– Davislor
Jan 21 at 0:12
@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the command
textquestiondown
, or the slot in a specific encoding. Loading fontspec
and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc
turns it back on.– Davislor
Jan 21 at 0:12
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified as
usepackage[latin1]{inputenc}
or usepackage[utf8]{inputenc}
or whatever encoding is in use. So ¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)– David Carlisle
Jan 21 at 0:46
No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified as
usepackage[latin1]{inputenc}
or usepackage[utf8]{inputenc}
or whatever encoding is in use. So ¿
is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)– David Carlisle
Jan 21 at 0:46
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you load
fontenc
but not fontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc
or selinput
. If you load both fontenc
and fontspec
, these characters will not be activated and some of them will break.– Davislor
Jan 21 at 1:09
@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that. if you load
fontenc
but not fontspec
, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc
or selinput
. If you load both fontenc
and fontspec
, these characters will not be activated and some of them will break.– Davislor
Jan 21 at 1:09
add a comment |
Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecoding{T1}selectfont
is the real problem, although that is most commonly activated by
usepackage[T1][fontenc}
so it is simplest to tell people not to use fontenc
.
In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.
The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
Do you know an example where the hyphenation goes wrong due tofontenc
andT1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730
– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
@moewe well naturally it gets better withgrüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).
– Ulrike Fischer
Feb 5 at 14:30
|
show 4 more comments
Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecoding{T1}selectfont
is the real problem, although that is most commonly activated by
usepackage[T1][fontenc}
so it is simplest to tell people not to use fontenc
.
In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.
The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
Do you know an example where the hyphenation goes wrong due tofontenc
andT1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730
– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
@moewe well naturally it gets better withgrüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).
– Ulrike Fischer
Feb 5 at 14:30
|
show 4 more comments
Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecoding{T1}selectfont
is the real problem, although that is most commonly activated by
usepackage[T1][fontenc}
so it is simplest to tell people not to use fontenc
.
In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.
The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecoding{T1}selectfont
is the real problem, although that is most commonly activated by
usepackage[T1][fontenc}
so it is simplest to tell people not to use fontenc
.
In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.
The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
edited Jan 20 at 12:34
answered Jan 20 at 11:47
David CarlisleDavid Carlisle
489k4111321880
489k4111321880
Do you know an example where the hyphenation goes wrong due tofontenc
andT1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730
– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
@moewe well naturally it gets better withgrüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).
– Ulrike Fischer
Feb 5 at 14:30
|
show 4 more comments
Do you know an example where the hyphenation goes wrong due tofontenc
andT1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730
– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
@moewe well naturally it gets better withgrüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).
– Ulrike Fischer
Feb 5 at 14:30
Do you know an example where the hyphenation goes wrong due to
fontenc
and T1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730– moewe
Jan 20 at 12:51
Do you know an example where the hyphenation goes wrong due to
fontenc
and T1
? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730– moewe
Jan 20 at 12:51
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts
– David Carlisle
Jan 20 at 13:25
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)
– moewe
Jan 20 at 13:36
1
1
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.
– David Carlisle
Feb 5 at 14:20
1
1
@moewe well naturally it gets better with
grüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).– Ulrike Fischer
Feb 5 at 14:30
@moewe well naturally it gets better with
grüßen
- in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen
) or bad hyphenation with good output (grüss en
).– Ulrike Fischer
Feb 5 at 14:30
|
show 4 more comments
Thanks for contributing an answer to TeX - LaTeX Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f470976%2fare-there-cases-where-fontenc-luatex-or-xetex-cause-problems%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Great first question! I learned some new things by researching the answer. Thanks.
– Davislor
Jan 20 at 4:21
examples how to use both traditional font encodings and OpenType fonts in same document are given there
– user4686
Jan 20 at 8:54